ED 277 953                                        CG 019 641

AUTHOR          Kirby, Douglas
TITLE           Sexuality Education: A Handbook for the Evaluation of
                Programs.
INSTITUTION     Mathtech, Inc., Arlington, VA.
SPONS AGENCY    Center for Population Options, Washington, DC.;
                Centers for Disease Control (DHHS/PHS), Atlanta, GA.
PUB DATE        84
NOTE            17[?]p.; For the complete report, see CG 019 636-642.
AVAILABLE FROM  Network Publications, 1700 Mission St., Suite 203,
                P.O. Box 1830, Santa Cruz, CA 95061-1830.
PUB TYPE        Guides - Non-Classroom Use (055) -- Tests/Evaluation
                Instruments (160)
EDRS PRICE      MF01 Plus Postage. PC Not Available from EDRS.
DESCRIPTORS     Adolescents; Elementary Secondary Education;
                *Evaluation Methods; Parent Child Relationships;
                *Program Evaluation; *Research Methodology; *Sex
                Education; *Sexuality

ABSTRACT

This document is the fifth volume of a six-volume report on sexuality
education. This volume is based on the methods used and the experiences
encountered in the evaluation of the nine exemplary sexuality education
programs contained in the first volume of the report. The present volume
discusses the need for evaluation of sexuality education programs; selection
of program characteristics and outcomes to be measured; experimental designs;
survey methods; questionnaire design; and procedures for administering
questionnaires, analyzing data, and using existing data. The volume focuses
primarily upon the evaluation of sexuality education programs in the classroom
for young people but also discusses the evaluation of peer education programs,
one-day conferences, and programs for parents. The chapters discuss
sequentially the important steps in conducting evaluation research. The
appendix contains questionnaires and assessment inventories concerned with
knowledge, attitudes and values, behavior, course evaluation, and course
impact, and includes a course assessment questionnaire for parents. Thirteen
figures and tables are included. (KB)

Reproductions supplied by EDRS are the best that can be made
from the original document.







U.S. DEPARTMENT OF EDUCATION
Office of Educational Research and Improvement

EDUCATIONAL RESOURCES INFORMATION
CENTER (ERIC)

This document has been reproduced as
received from the person or organization
originating it.
Minor changes have been made to improve
reproduction quality.

Points of view or opinions stated in this document
do not necessarily represent official
OERI position or policy.



"PERMISSION TO REPRODUCE THIS
MATERIAL IN MICROFICHE ONLY
HAS BEEN GRANTED BY



TO THE EDUCATIONAL RESOURCES
INFORMATION CENTER (ERIC)."



by Douglas Kirby 






Sexuality 
Education: 



A Handbook for the 
Evaluation of Programs 



Developed at Mathtech, Inc. 
by Douglas Kirby, PhD 



Network Publications, Santa Cruz, 1984 



Final report to the U.S. Department of Health and Human Services,
Public Health Service, Centers for Disease Control,
Center for Health Promotion and Education.

The opinions expressed in this report are those of the author(s) and do not necessarily reflect those of the U.S.
Government.

Developed by Mathtech, Inc., 1401 Wilson Blvd., Suite 930,
Arlington, VA 22209
Telephone: (703) 243-2210

Part of this research was supported by the Center for Population Options,
2031 Florida Ave., N.W., Washington, D.C. 20009
Telephone: (202) 387-5091

For ordering information contact
Network Publications, a division of ETR Associates
P.O. Box 8506
Santa Cruz, CA 95061-8506
Telephone: (408) 429-8922



PREFACE
    Background of This Project
    Overview of This Report

INTRODUCTION
    The Need for Evaluating Sexuality Education Programs
    An Appraisal of Sex Education Evaluation
    About This Volume

1 DEFINING THE BASIC PARAMETERS OF THE STUDY
    Initial Decisions about the Goals and Resources of the Evaluation
    Basic Decisions about Methodological Approaches
    The Role of Values in Conducting Research
    Overall Approach to Designing and Evaluating Programs

2 IDENTIFYING IMPORTANT FEATURES, OUTCOMES, AND GOALS OF PROGRAMS
    Task 1: Establish Major Goals
    Task 2: Specify Behavioral Objectives Leading to These Goals
    Task 3: For Each Behavioral Objective, Specify the Necessary
        Knowledge, Attitudes, and Skills
    Steps Within Each Task
    Task 4: Identify Unexpected or Undesired Effects

PART [numeral illegible]: SELECTING THE OVERALL DESIGN

3 USING EXPERIMENTAL AND QUASI-EXPERIMENTAL DESIGNS
    One-Shot Case Study
    One-Group Pretest Posttest Design
    Nonequivalent Pretest Posttest Control Group Design
    Randomized Pretest Posttest Control Group Design
    Delayed Treatment Design
    Pretest and Multiple Posttest Control Group Design
    Posttest-Only Control Group Design
    Alternative Tests Design
    Solomon Four Group Design
    Time Series Design
    Summary

4 [...] SURVEYS
    Surveys Versus Experimental Designs
    [...] Designing and Evaluating Programs

PART [numeral illegible]: DESIGNING QUESTIONNAIRES

5 [...]ING THE FUNDAMENTALS OF QUESTIONNAIRE DESIGN
    Important Steps in Designing Questionnaires
    Determining the Important Features and Outcomes That Should be
        Measured
    Constructing the Questionnaire
    Pretesting the Questionnaire
    Assessing Reliability
    Assessing Validity
    Conclusion

6 MEASURING PARTICIPANTS' ASSESSMENTS OF THE PROGRAM
    Using Participants' Assessments
    Writing Questions
    Choosing Response Categories

7 DESIGNING KNOWLEDGE TESTS
    Using Existing Knowledge Tests
    Selecting Formats
    Selecting the Number of Questions in Each Content Area
    Writing Questions
    Conducting an Item Analysis
    References

8 DESIGNING QUESTIONNAIRES TO MEASURE ATTITUDES, VALUES,
  AND FEELINGS
    Selecting Important Attitudes and Values to Measure
    Using Scales Constructed by Others
    Selecting the Best Scales
    Constructing and Pretesting the [...]
    References

9 DESIGNING QUESTIONNAIRES TO MEASURE BEHAVIOR AND SKILLS
    Determining the Important Behaviors to Be Measured
    Constructing the Questionnaire
    Pretesting the Questionnaire
    Assessing Reliability and Validity

10 SELECTING A SAMPLE
    Selecting a Sample Size
    Improving the Randomness of a Sample
    Improving the Response Rates
    The Sampling of Programs
    Reference

11 ADMINISTERING QUESTIONNAIRES
    Obtaining Approval
    Selecting a Test Administrator
    Selecting Dates
    Ensuring Voluntariness While Encouraging Cooperation
    Ensuring Anonymity
    Using Identification Numbers
    Giving Directions and Answering Questions
    Allowing Sufficient Time

12 USING UNOBTRUSIVE MEASURES
    Using Unobtrusive Measures in Other Fields
    Using Unobtrusive Measures to Measure Contraceptiv[...]
        and STD Rates
    Using Unobtrusive Measures to Evaluate Other Goals
    Reference

PART [numeral illegible]: ANALYZING THE DATA

13 PREPARING DATA FOR ANALYSIS
    Doing the Analysis by Hand Versus Computer
    Coding Questionnaire Data
    Keypunching the Data
    Setting up Keypunched Data on the Computer
    Creating an SPSS or SAS Program File
    Cleaning the Data
    Reference

14 STATISTICAL ANALYSIS
    Kinds of Data
    Descriptive Statistics
    Inferential Statistics
    Meaningfulness of Results
    Recommended Statistics Books

15 WRITING THE EVALUATION REPORT
    Planning the Writing Project
    Presenting Quantitative Results
    Presenting Nonquantitative Data
    Dilemmas in Writing and Publishing the Results
    Suggested Readings

16 EVALUATING SPECIFIC KINDS OF PROGRAMS
    Comprehensive Programs Lasting About a Semester
    Short Structured Courses Lasting 1 or 2 Weeks
    One-day Conferences
    Peer Education Programs
    Parent/Child Programs
    Conclusions

KNOWLEDGE QUESTIONNAIRE

ATTITUDE AND VALUE INVENTORY
    Scales in the Attitude and Value Inventory

BEHAVIOR INVENTORY

KNOWLEDGE, ATTITUDE, AND BEHAVIOR QUESTIONNAIRE

COURSE EVALUATION

ASSESSMENT OF COURSE IMPACT

COURSE ASSESSMENT FOR PARENTS


FIGURE 1-1   Major Stages in Designing and Implementing a Program and
             Evaluation

TABLE 6-1    Examples of Different Response Categories

TABLE 8-1    A Likert Scale to Measure Self Esteem

TABLE 8-2    Semantic Differential Scale Used to Measure Attitude toward
             Contraception

FIGURE 14-1  Knowledge Test Scores Presented as Original Raw Data

FIGURE 14-2  Pretest and Posttest Scores Ordered in Arrays

FIGURE 14-3  Knowledge Test Scores Presented in Frequency Distributions

FIGURE 14-4  Knowledge Test Scores Presented as Percentage and Cumulative
             Percentage Distributions

FIGURE 14-5  Knowledge Test Scores Presented in Grouped Frequency
             Distributions

TABLE 15-1   Mean Pretest and Posttest Scores on a 40-Item Multiple
             Choice Test

TABLE 15-2   Mean Pretest and Posttest Scores on a 40-Item Multiple
             Choice Test for a Sexuality Education Class and Its
             Control Group

FIGURE 15-1  Number of People Receiving Different Scores on Pretests
             and Posttests

FIGURE 15-2  The Mean Knowledge Test Scores for Students on the Pretest
             and Posttest



Walter J. Gunn, Ph.D., Director, Research and Evaluation, developed the
approach for this entire project, initiated the contract, monitored progress, and
provided technical assistance, guidance, administration, and support. Clearly, this
and the other volumes would not have been possible without his continuing effort and
support.

Judith Alter, Jesse Blatt, Nancie Connolly, Lynne Cooper, Bernard Kirby, Guy
Parcel, and Peter Scales carefully read the volume and made numerous helpful
comments. Lynne Cooper made numerous substantive suggestions that were especially
helpful and improved the volume.

Ann Thompson Cook spent many hours editing this volume. She has made it much
more clear, concise, and readable. Karen Allan provided considerable help with the
typing and production.






PREFACE



Background of This Project

During the mid-1970s the Carter administration recognized the large number of
unintended teenage pregnancies in America and sought solutions to this major
problem. That administration recognized that one potentially effective solution was
sexuality education. Consequently, it asked the Center for Health Promotion and
Education (formerly the Bureau of Health Education) in the Centers for Disease
Control to identify, improve, and evaluate promising approaches to sexuality
education.

The current project followed an earlier 1978 contract that the Center for
Health Promotion and Education awarded to Mathtech to identify promising programs
and to develop evaluation methods. In that project, Mathtech, with the help of many
sexuality educators and other related professionals:

• identified and rated about 200 features and outcomes of programs
potentially important to reducing pregnancy and increasing psychological
health

• reviewed the literature on the effects of sex education programs

• identified 10 promising programs representing several different approaches

• developed questionnaires and other methods to more effectively measure the
important outcomes of these promising programs

• summarized the work in a six-volume report entitled An Analysis of U.S.
Sex Education Programs and Evaluation Methods.

In 1979 the Center awarded Mathtech a second contract to help improve and then
evaluate 10 of the promising sexuality education programs. Mathtech selected 10
exemplary programs that represented a variety of different approaches to sexuality
education. The programs include 6-hour programs, semester programs, conferences,
programs for young people alone and for young people and their parents together,
peer education programs, both school and non-school programs, and both educational
and clinic approaches. Mathtech:

• conducted an initial evaluation of each program

• suggested numerous changes which the sites incorporated

• offered training to the program staffs

• provided some materials and other kinds of support

• then carefully evaluated the programs.

The results of this contract are summarized in this report.

The Organization of This Report

The complete report contains several separate volumes and an Executive Summary
which summarizes the first volume. Although all of the volumes are an integrated
package which we hope will meet many varied needs of educators, evaluators, and
policy makers, some of the volumes will have particular interest for selected groups
of people, and each volume is complete and can be used independently of the others.

Sexuality Education: An Evaluation of Programs and Their Effects...An
Executive Summary summarizes first the existing information on sexuality education
in the United States and then the overall design, methods, and major findings of
this evaluation.

The first volume, Sexuality Education: An Evaluation of Programs and Their
Effects, summarizes the structure and content of sexuality education in the United
States, reviews the literature on the effects of sexuality education, describes the
evaluation methods, provides a description of and the evaluation data for each
program, and summarizes the effectiveness of different approaches in meeting
different goals.

The second volume, Sexuality Education: A Guide to Developing and Implementing
Programs, provides suggestions for developing and implementing effective educational
and clinic-based approaches to sexuality education. It discusses the reasons for
and nature of responsible sexuality education and describes approaches to building a
community-based program, selecting teachers and finding training, assessing needs of
the target population, and designing and implementing programs for them. It also
provides suggestions for evaluating programs.

The third volume, Sexuality Education: A Curriculum for Adolescents, is based
upon the curricula of the most comprehensive programs. These programs increased
knowledge and helped clarify values. The curriculum consists of the following
units: Introduction to Sexuality, Communication Skills, Anatomy and Physiology,
Values, Self Esteem, Decisionmaking, Adolescent Relationships, Adolescent Pregnancy
and Parenting, Pregnancy Prevention, Sexually Transmitted Diseases, and Review and
Evaluation. Each unit contains a statement of goals and objectives, an overview of
the unit contents, several activities that address the goals and objectives, and
wherever needed, lecture notes and handouts.

The fourth volume, Sexuality Education: A Curriculum for Parent/Child
Programs, is based upon the parent/child program which increased knowledge and
parent/child communication. The curriculum includes several suggested course
outlines and the following units: Introduction to Course; Anatomy, Physiology, and
Maturation; Gender Roles; Sexually Transmitted Diseases; Reproduction; Adolescent
Sexuality; Birth Control; Parenting; and Review. Each unit contains several
activities and, wherever necessary, lecture notes and handouts.

This fifth volume, Sexuality Education: A Handbook for Evaluating Programs, is
based upon the methods we used and our experiences in evaluating these programs. It
discusses the need for evaluation of sexuality education programs; selection of
program characteristics and outcomes to be measured; experimental designs; survey
methods; questionnaire design; and procedures for administering questionnaires,
analyzing data, and using existing data.

A sixth volume, Sexuality Education: An Annotated Guide for Resource
Materials, reviews books, films, filmstrips, curricula, charts, models, and games
for youth in elementary school through high school. For each resource, the guide
lists the distributor, length, cost, and recommended grade level, and provides a
discussion of the material. This volume differs from the others in that it was not
funded by the government and is not part of the final report. However, it will be
useful to people developing programs.






INTRODUCTION



The word "evaluation" commonly refers to a variety of informal and formal,
nonsystematic and systematic assessments and judgments. This guide will use
"evaluation" in its more narrow and scientific sense, as the formal and systematic
process of collecting information about a program in order to determine the
effectiveness of that program and to make better decisions about that program.

There are several different models for collecting information. This guide uses
primarily a goal-attainment model which generally has four major steps: 1) defining
the measurable program goals and objectives, 2) designing methods of measuring and
quantifying those goals and objectives, 3) collecting data that measure them, and 4)
reaching conclusions about the extent to which the goals and objectives are reached.
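
The four steps above can be sketched in a few lines of Python. This is only an illustration of the goal-attainment idea, not a procedure taken from this guide: the objective names, target values, and scores are hypothetical, and "mean score meets a target" is an assumed stand-in for whatever criterion an evaluator actually adopts.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Objective:
    name: str            # step 1: a measurable objective (hypothetical)
    target: float        # step 2: how attainment is quantified (assumed: a target mean)
    scores: List[float]  # step 3: data collected from participants (made up)

    def attained(self) -> bool:
        # step 4: conclude whether the objective was reached
        return sum(self.scores) / len(self.scores) >= self.target


objectives = [
    Objective("knowledge test (0-40 points)", 30.0, [28, 33, 35, 31]),
    Objective("comfort talking with parents (1-5 scale)", 4.0, [3, 4, 3, 5]),
]

# Map each objective to the conclusion reached about it.
report = {o.name: o.attained() for o in objectives}
for name, reached in report.items():
    print(f"{name}: {'reached' if reached else 'not reached'}")
```

Keeping the target next to the data makes it harder to redefine success after the results are in, which is one reason to fix goals before collecting data.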

The goal-attainment model can be contrasted with a goal-free model, which
involves measuring all outcomes, not just goals, and a systems model, which involves
analyzing costs and benefits.

The Need for Evaluating Sexuality Education Programs

As a general rule, any social activity, program, or policy designed to
alleviate social problems should be carefully evaluated whenever 1) the program or
practice and decisions about it are important, 2) the outcomes cannot be assessed
without an evaluation, and 3) informal and nonsystematic observations and
information cannot provide sufficient data for decisionmaking. Even when informal
observations are sufficient for decisionmaking, programs should be systematically
evaluated whenever other people require evidence about the success of the program.
Without careful evaluation, ineffective practices or programs may be maintained, and
effective practices or programs may be canceled.

The evaluation of sexuality education programs is particularly important.
First, sexuality education programs are designed and implemented to improve the
lives of young people in very important ways. For example, some educators hope
sexuality education will improve interpersonal communication, decisionmaking,
responsibility, social relationships, and self esteem, and that it will reduce
unwanted sexual activity, unprotected intercourse, unwanted pregnancies, sexually
transmitted diseases, rape, and some sexual dysfunctions. To achieve such outcomes,
substantial and increasing amounts of time, money, and other resources have been
devoted to sexuality education.

Second, nonsystematic observations and past research have left unanswered many
important questions about sexuality education programs.

• What are the long term effects of sexuality education?

• How does it affect students' attitudes and behaviors?

• Does it reduce unwanted pregnancy and sexually transmitted diseases?

• Does it improve young people's communication with parents?






• Are shorter programs more cost effective than semester programs or vice
versa?

• Are separate courses more effective than units which are part of other
courses (e.g., a sexuality unit within a science course)?

• What topics are most important?

• What characteristics of teachers are most important?

• What kinds of activities (lectures, discussions, role-playing, films)
are most effective?

These and other questions remain unanswered. Moreover, when these questions
are answered about sexuality education in general, similar questions about
individual programs with specific structures (e.g., number of sessions and length),
specific personnel, and specific materials will still remain. Effectiveness should
be determined in each case.

Third, sexuality education has been the subject of heated controversy
throughout the country. Opponents claim that sexuality education has many negative
effects; proponents claim that it has many positive effects. Neither side can prove
its case. Well validated research can eventually help calm these conflicts.

Fourth, program providers often need the clarity and realism that evaluation
produces. Too many programs have vague and unrealistic goals. As staff members
anticipate the evaluations, or become involved in them, they become both more clear
and more realistic about the program and its goals. Thus, the mere process of
evaluating programs can help improve the programs.

Both the importance of sexuality education and the need for evaluation are
demonstrated by the many people who are currently asking important and difficult
questions about sexuality education. Each month reporters from newspapers,
magazines, and radio and television stations request information on the amount of
sexuality education in schools, the comprehensiveness of programs, and the effects
of programs. Each month several Congressional representatives request information
about the effects of sexuality education programs. They ask whether programs reduce
unwanted pregnancies, increase self esteem, and improve the psychological health of
adolescents. Each month educators ask about the evidence for the success of
programs and the realism of meeting expected goals. Unfortunately, most of these
questions and requests for information cannot be adequately answered because the
necessary research has not been conducted or completed.

The importance of evaluating sexuality education programs is further
exemplified by the surprising results of evaluating programs in other fields. For
example, many states, observing that teenagers were involved in a disproportionate
number of automobile accidents, developed drivers' education programs for them. The
educators and others believed that such programs would increase the students'
knowledge about safety, make them more responsible, increase their driving skills,
and consequently reduce their number of accidents. However, several recent studies
demonstrate that drivers' education helps teenagers drive at an earlier age and,
therefore, it may ultimately increase the number of accidents and deaths among
teenagers. These results are just the opposite of expectations.

You should not conclude from the example above that if drivers' education
increases accidents, then sexuality education will increase sexual activity and
pregnancies. There is a critical difference between the two programs: drivers'
education is designed to teach students to drive, but sexuality education is NOT
designed to teach young people to have sex. However, the example does demonstrate
the importance of making sure that our important social and educational programs,
including sexuality education, are having the effect(s) we intend.



An Appraisal of Sex Education Evaluation

Many evaluations of sexuality education programs have not been true evaluations:
they have described the programs, but have not carefully evaluated the effects of
the programs.

Of those evaluations that have examined the effects of the programs, most have
employed some type of experimental or quasi-experimental design. In such studies,
the sexuality education class is considered the experimental group, and occasionally
some other class is treated as the control group. Evaluators then give
questionnaires both before and after the course to both the experimental and control
subjects. This kind of design can provide good evidence for the effects of the
course.
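
The logic of this pretest/posttest comparison can be sketched as follows. The scores are invented for illustration, and subtracting the control group's mean gain from the class's mean gain is one simple, assumed way of summarizing the design; it is not a formula quoted from this guide.

```python
from statistics import mean


def mean_gain(pretest, posttest):
    """Average posttest-minus-pretest change for matched respondents."""
    return mean(post - pre for pre, post in zip(pretest, posttest))


# Hypothetical scores on a 40-item knowledge test (same respondents, same order).
class_pre, class_post = [20, 22, 25, 18], [30, 31, 33, 26]
control_pre, control_post = [21, 23, 24, 19], [22, 25, 24, 20]

class_gain = mean_gain(class_pre, class_post)        # change in the sexuality education class
control_gain = mean_gain(control_pre, control_post)  # change with no course (testing effects, maturation)
net_effect = class_gain - control_gain               # change plausibly attributable to the course

print(class_gain, control_gain, net_effect)
```

The control group's gain matters because students often score somewhat higher on a second administration of the same questionnaire even without instruction.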

Unfortunately, there are numerous limitations with the evaluations that have
employed this design:

• Many studies have evaluated single programs which may or may not be
representative of all sexuality education programs, and thus it is
difficult to generalize from them to other courses.

• Because evaluators have rarely been able to randomly assign students to
experimental and control groups, some self-selection factors may have
affected their results.

• Very few evaluations have measured effects beyond the end of the program.

• Most questionnaires focused upon knowledge and failed to measure many
important attitudes and behaviors.

• Many questionnaires have been poorly designed.

• Many evaluations reported the statistical significance of the change in
students, but few evaluations reported the magnitude of the change and its
theoretical or practical significance.
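
The last limitation, reporting significance without magnitude, can be illustrated with a short sketch. The scores are hypothetical, and the standardized gain used here (mean gain divided by the pretest standard deviation) is one common convention assumed for the example; other effect-size measures exist and may suit a given study better.

```python
from statistics import mean, stdev

MAX_SCORE = 40  # assumed length of the knowledge test

# Hypothetical matched pretest and posttest scores.
pre = [20, 24, 28, 22, 26]
post = [22, 25, 30, 24, 27]

gains = [b - a for a, b in zip(pre, post)]
raw_gain = mean(gains)                       # average change in test points
percent_gain = 100 * raw_gain / MAX_SCORE    # change as a share of the whole test
standardized_gain = raw_gain / stdev(pre)    # change in pretest standard-deviation units

print(f"mean gain: {raw_gain:.1f} points "
      f"({percent_gain:.0f}% of the test, {standardized_gain:.2f} SD)")
```

A gain like this could be statistically significant in a large sample while amounting to fewer than two additional correct answers, which is why the size of the change should be reported alongside any significance test.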

Fortunately, during the last few years, an increasing number of people have
been recognizing the need for evaluation, and there has been considerable growth in
the evaluation of sexuality education. Research groups are developing and
disseminating new evaluation materials; professional organizations are offering
special sessions or seminars on evaluation; a few research groups such as E.T.R.
Associates, Mathtech, and Johns Hopkins University are conducting more formal
evaluations of programs; and an increasing number of schools, clinics, and other
youth-serving organizations are actually evaluating their programs. Thus, the
direction is positive, but the need will be met only with considerable effort for
many years.



About This Volume

This guide introduces methods of evaluating sexuality education programs. It
discusses the need for evaluation; selection of program characteristics and outcomes
to be measured; experimental designs; survey methods; questionnaire design; and
procedures for administering questionnaires, analyzing data, and using existing
data. It provides both fundamental principles and practical suggestions for
evaluation. In the appendix are reliable, valid questionnaires that have been used
in the evaluation of sexuality education programs.






The volume focuses primarily upon the evaluation of sexuality education
programs in the classroom for young people but also discusses the evaluation of peer
education programs, one-day conferences, and programs for parents. Educators can
apply the same principles and methods to other kinds of programs.

Much of the volume is written in sufficient detail for the lay person with only 
a beginning knowledge of evaluation methods, but contains numerous practical 
suggestions and several sections that should be helpful to the more advanced 
methodologist as well. 

The chapters in this guide discuss sequentially the important steps in
conducting evaluation research. Many aspects of a good design are interdependent,
however, so that one must continually think ahead to subsequent steps when making
decisions about earlier steps.

If you have never conducted an evaluation, this volume, and evaluation more
generally, may appear intimidating. If so, start small: peruse this volume and
conduct a small and relatively simple evaluation. Use a simple design, measure only
a few outcomes, and use a small sample. Then reread this volume and improve your
design and questionnaires. Your initial evaluation may provide useful information
for your program and will help you learn about evaluation so that your subsequent
evaluation efforts can be more rigorous and valid.

If you have conducted several evaluations, you may already be familiar with 
parts of this volume. Feel free to skim those parts and to focus on those parts 
that are most informative. 




1

DEFINING THE BASIC PARAMETERS OF THE STUDY



Evaluation is a process of systematically collecting information so that people
can make better decisions about programs. However, people in different positions
need varying kinds of information to make their decisions. For example, staff and
administrators may want information about the impact of the program while funders
may prefer information on the numbers of people served and the costs of the program.
Collecting these different kinds of information requires different methods. The
different needs of different groups are legitimate, but you will need to establish
priorities. Before you begin to design specific tools for collecting data, you
should make a number of basic decisions about the overall goals, scope, and
structure of the evaluation.



Initial Decisions about the Goals and Resources of the Evaluation

Who Will Use the Evaluation Results?

For any evaluation of sexuality education programs, there are several possible
users:

• The educators or instructors themselves

• The administrators of the program

• The people or agencies funding the program

• Other professional educators or organizations involved with sexuality
education and interested in putting on a program

• Lay members of the community interested in the program

• Groups served by the program.

Each of these groups may have legitimate, but different needs. Often their needs
will overlap so that any evaluation will be helpful to many of them. However, you
will have to make many decisions which will make the evaluation better suited to one
group than another. Select the primary users and then direct the evaluation to
them.



What Is the Purpose of the Evaluation?

The different groups mentioned above may have different purposes for the
evaluation. Moreover, any one group may also have more than one purpose. Users may
want the evaluation:

• to describe the contents of the existing program

• to assess the impact of the program upon the participants

• to assess the relative effectiveness of different program components

• to estimate the number of people being served

• to assess the total cost of the program and the cost per person served
and/or

• to identify ways to improve the program.

Once again, you will need to decide which of these different purposes take priority
for the evaluation.
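
Two of the purposes listed above, estimating total cost and cost per person served, reduce to simple arithmetic. The sketch below uses invented cost categories and figures purely for illustration; which costs to count is a decision the evaluator has to make and is assumed here.

```python
def cost_summary(costs, people_served):
    """Return (total cost, cost per person served) for a program."""
    if people_served <= 0:
        raise ValueError("people_served must be positive")
    total = sum(costs.values())
    return total, total / people_served


# Hypothetical annual costs in dollars for one program.
costs = {"staff time": 4200.00, "materials": 650.00, "facility": 150.00}

total, per_person = cost_summary(costs, people_served=125)
print(f"total cost: ${total:,.2f}; cost per person served: ${per_person:,.2f}")
```

Comparing cost per person across programs is only meaningful when the same cost categories and the same definition of "served" are used for each.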



Which Program Can Be Evaluated?

Obviously, you need to know what programs or components you are going to
evaluate before you can evaluate them. In schools, the sexuality education program
is usually well defined. Typically it includes the sexuality units or courses in
the school curriculum and the instruction is given to classes of students which meet
at specified times in the classroom. However, nonschool programs may be less
clearly delineated or may have many components. For example, a youth program may
include regular group discussions at the agency, occasional parent/child activities,
occasional films for people who drop in, outreach efforts at health fairs, and media
public information spots. These different components may be linked in different
ways, with some people participating in two or more components and others
participating in only one.

If a program has multiple components, you may want to carefully define each 
component, decide which components to evaluate, and then measure the unique 
contribution of each component* Alternatively, you may want to measure the 
cumulative and interactive effect of all the components. This will give you 
evidence for the success of your entire program, but will not help you judge the 
relative importance of each component* 

Is the Study Feasible? 

When choosing the basic parameters of the study (the kind of experimental
design or survey, the basic kinds and numbers of questions to be asked, the
approximate number of people in the sample), you need to constantly keep in mind
the resources available to your study. You must be able to answer the following
questions affirmatively:

• Are the goals of the evaluation realistic? 

• Are the funds, labor, and other necessary resources available to complete 
the evaluation? 

• Is the proposed evaluation politically feasible? Will the necessary groups 
support it? Can the evaluators deal effectively with any opposition groups 
that may try to block it? 

• Is it possible to obtain the desired data? Will the proposed respondents 
to the questionnaires actually complete the questionnaires? Will they find 
the questions acceptable and not too sensitive or personal? Can they 
complete the questionnaires in a reasonable period of time? Will their 
answers be reliable and valid? 

• Are the resources available for data analysis and report writing?

• Can the report be disseminated appropriately? 

Two resources are particularly important: the willingness of people to
participate in and help with the evaluation and the availability of funds for
materials, computer time, and professional help. The willingness of participants is
perhaps the most critical. If program participants are not willing to fully
cooperate with the evaluation, then its validity will be seriously compromised. If
there are relatively few funds, you may still be able to complete a valid
evaluation, but doing so will be more difficult. Occasionally, you can obtain a
small grant to support your research. Otherwise, you can reduce costs without
sacrificing quality by:

• collecting only the most important information that you need 

• modifying and then using previously validated questionnaires instead of
creating questionnaires from scratch

• collecting data from only a sample of people instead of the entire
population

• having program participants and staff collate questionnaires, put them in
envelopes, and later code them

• scoring questionnaires by hand, instead of using computer facilities or
using tests which can be machine scored

• obtaining statistical advice and analytic support from graduate students
who might wish to use the evaluation as part of their work toward an advanced
degree.
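One of the cost-saving steps listed above, collecting data from only a sample of people, can be sketched in a few lines of Python. This is only an illustrative sketch of simple random sampling: the function name, the roster of ID numbers, and the sample size are hypothetical and do not come from this handbook.

```python
import random


def draw_sample(roster, sample_size, seed=None):
    """Return a simple random sample of participants, drawn without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(list(roster), sample_size)


# Hypothetical roster of 200 participant ID numbers; survey only 50 of them.
roster = range(1, 201)
sample = draw_sample(roster, 50, seed=42)
print(len(sample))  # 50
```

Because every person on the roster has the same chance of being chosen, the sample is not tilted toward the most cooperative or most visible participants.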



Who Will Conduct the Evaluation? 

Increasingly, a wide variety of people are conducting evaluations of programs.
They range from teachers or clinic counselors with relatively little experience in
evaluation to professional methodologists with substantial experience in evaluation.
Ideally, the person conducting an evaluation would have:

• familiarity with basic methodological concepts such as experimental designs 

• previous experience conducting evaluations 

• skills in coordinating and administering 

• freedom and ability to conduct an unbiased evaluation. 

However, if you do not have previous experience, you can still collect information
with the use of materials like this handbook and some advice from consultants. As
suggested in the Introduction, you can first conduct a relatively simple evaluation,
and as you become more experienced, then improve the design, questionnaires, and
validity of your evaluation.



Basic Decisions about Methodological Approaches

Descriptive Versus Evaluative Information

In evaluation, there is an important distinction between descriptive 
information and evaluative information. The former simply describes the existing
program: its length, the number of hours the classes meet, the topics covered, the 
different kinds of activities used, the characteristics of the staff, the number of 
students that attend. Evaluative information describes the quality and success of 
the program by measuring its effects. 

Too often people have evaluated their programs by presenting only descriptive
information about their programs. Unfortunately, the existence of excellent
resources does not always assure desired effects. Therefore, if you want to measure
the actual effectiveness of your program (whether or not it has a desired impact),
you should collect evaluative information and actually measure the effects.






This handbook focuses upon methods for obtaining evaluative information. Thus,
it will not discuss methods for describing the program components, counting the
numbers of people served, or estimating the costs of the program, but it will
discuss methods for helping improve the program and methods for assessing the impact
of the program.

Formative Versus Summative Evaluation

When collecting evaluative information (as opposed to descriptive information),
there are two kinds of evaluations: formative and summative. In a formative
evaluation, evaluators might lead group discussions or administer questionnaires in
which they ask the program participants how they liked the program, what parts of
the program they would change, and how they would improve it. Because the focus of
formative evaluations is to give feedback as quickly as possible, such evaluations
are commonly conducted during the course as well as at the end of the course.
Educators who are primarily interested in improving their programs should conduct a
formative evaluation. Such an evaluation is designed to provide more quickly the
kinds of data educators need to improve their programs.

When evaluators are primarily interested in measuring the effectiveness of a
program, they should conduct a summative evaluation. Such an evaluation focuses
more directly upon the actual outcomes of the program and will thereby help other
educators and policymakers decide whether they wish to adopt this program or other
programs. Although a summative evaluation can help educators improve their program,
a summative evaluation provides less direct information about specific ways to
improve the program.

Because a summative evaluation reports the success of an entire program, and
because that report may affect decisions about the continuation of that program or
the adoption of that program elsewhere, that evaluation is especially important, and
the methods must be especially valid and defensible. Thus, summative evaluations are
more likely than formative evaluations to use experimental or quasi-experimental designs.

Experimental and Quasi-Experimental Designs Versus Survey Methods

Experimental and quasi-experimental designs and surveys often involve the
administration of questionnaires one or more times and may measure the effects of
sexuality education programs. How, then, do they differ?

True experimental designs differ from quasi-experimental designs and surveys in
one critical respect: experimental designs include the random assignment of people
to the experimental and control groups. In the evaluation of sexuality education,
the experimental group participates in the sexuality education program and the
control group does not. The random assignment of people to the experimental and
control groups will tend to make the two groups similar before the experimental group
participates in the program. Then, if the two groups are different after the
program, you may be able to attribute this difference to participation in the
program. Some experimental designs also include the administration of
questionnaires before the program (pretests) and after the program (posttests), and
thereby allow you to actually measure change. Thus, experimental designs provide
the best evidence for the causal impact of sexuality education programs.
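The logic of random assignment and pretest/posttest comparison can be sketched in Python. This is only an illustration under stated assumptions, not a procedure prescribed by this handbook: the function names, the student identifiers, and the scores are hypothetical, and a real evaluation would also test whether the observed difference is statistically significant.

```python
import random
from statistics import mean


def assign_groups(participant_ids, seed=None):
    """Randomly split participants into an experimental and a control group."""
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def mean_change(pretest, posttest, group):
    """Average posttest-minus-pretest change for the people in one group."""
    return mean(posttest[p] - pretest[p] for p in group)


# Hypothetical knowledge scores for four students. For illustration, assume
# random assignment placed s1 and s2 in the program and s3 and s4 in the
# control group.
pretest = {"s1": 50, "s2": 60, "s3": 55, "s4": 65}
posttest = {"s1": 70, "s2": 75, "s3": 57, "s4": 66}

program_gain = mean_change(pretest, posttest, ["s1", "s2"])
control_gain = mean_change(pretest, posttest, ["s3", "s4"])
print(program_gain - control_gain)  # 16.0
```

The quantity of interest is the difference between the two gains: the control group's gain estimates how much scores would have changed without the program.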

Quasi-experimental designs do not include the random assignment of people to
the experimental and control groups. However, they do have some of the other
features of experimental designs. For example, they may include experimental and
control groups even though people were not randomly assigned to them, and they may
also include pretests and posttests. The evidence they provide for the causal
impact of programs is poorer than that of true experimental designs, but better than
that of surveys.

In contrast, surveys typically do not include any control over the
participants. Respondents in the sample may have participated in no sexuality
education program or in a variety of different sexuality education programs.
Moreover, they may have participated in a program recently or long ago. Thus, surveys
provide the poorest evidence for the impact of programs. They nevertheless can be
useful in obtaining additional information about programs.

Experimental designs and survey methods are discussed fully in Chapters 3 and 4
respectively.



Normative Referenced Versus Criterion Referenced Methods

When teachers or researchers evaluate individuals or groups, they often give
each person a score that is determined by that individual's performance relative to
the performances of others. For example, some teachers give the top 10% of the
students in a class an "A," the next 30% a "B," and so on. Similarly, people may be given
percentile rankings on Graduate Record Exams or other tests of knowledge or skill.
Such scores are based upon norms established by the group, and accordingly are
called normative referenced measures. Such scores order people; they indicate that
one particular person is more or less capable than others but do not tell anything
about the absolute capability of that individual. Consequently, they are very
useful whenever the object is to compare individuals. For example, graduate schools
prefer normative referenced scores to select graduate students.

On other occasions, evaluators assign scores based upon some set of specified
standards and not upon the relative performances of individuals. For example,
people who take a driving test are given a score (pass or fail) that depends not
upon their performance relative to others, but upon their ability to perform a set
of specified driving tasks. Because these scores are based upon specified criteria,
they are called criterion referenced measures. Such measures are useful in
determining those areas in which individuals have sufficient knowledge or skills and
those areas in which they need to improve.

Both normative and criterion referenced questionnaires may resemble each other 
in outward appearance. However, different steps must be completed to develop them. 
For example, when developing normative referenced questionnaires to measure 
knowledge, you should specify the different areas of knowledge that you wish to 
measure and write questions for each area. Some of these questions should be easy 
and others difficult so that the more informed students are separated from the less 
informed. When developing criterion referenced questionnaires, you should specify
very carefully the exact knowledge facts to be known, develop questions for those 
facts, and then specify criterion levels for the percentage of questions that people 
should get correct in order to be able to perform some desired activity. When 
designing this criterion referenced questionnaire, you would have less concern about 
selecting both easy and difficult questions. 

In the past, normative referenced measures were more commonly used, because
they were developed first, and also because they are so commonly used to grade
students in schools and universities. In general, however, criterion referenced
measures are preferable. When evaluating programs, you will probably want to know
whether the participants are learning the specific facts that are needed, and you
will probably want to know whether they need additional help in specific areas.
Criterion referenced measures can better provide this information.

Note that you can often develop questionnaires as criterion referenced measures
and use them in normative referenced measurement (to rate individuals relative to
one another), but you cannot do the reverse satisfactorily. That is, you cannot
develop questionnaires with normative referenced measures and then use them in
criterion referenced measurement (to ascertain whether specified standards have been
met), because the specific content areas and the needed levels of competence will
not have been specified. Thus, developing criterion referenced measures also
provides more versatility.
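The difference between the two kinds of scoring can be made concrete with a short Python sketch. It is illustrative only: the function names, the particular definition of percentile rank (the percentage of the group scoring strictly lower), the answer key, and the 75% criterion level are assumptions chosen for the example, not values taken from this handbook.

```python
def percentile_rank(score, all_scores):
    """Normative referenced: percentage of the group scoring below this score."""
    below = sum(1 for s in all_scores if s < score)
    return 100.0 * below / len(all_scores)


def meets_criterion(answers, answer_key, required_fraction):
    """Criterion referenced: were enough of the specified items answered correctly?"""
    correct = sum(1 for item, right in answer_key.items()
                  if answers.get(item) == right)
    return correct / len(answer_key) >= required_fraction


# Normative: where does a score of 18 fall relative to the class?
class_scores = [12, 15, 15, 18, 20]
print(percentile_rank(18, class_scores))  # 60.0

# Criterion: did this student answer at least 75% of the key facts correctly?
answer_key = {"q1": "b", "q2": "a", "q3": "d", "q4": "c"}
student = {"q1": "b", "q2": "a", "q3": "a", "q4": "c"}
print(meets_criterion(student, answer_key, 0.75))  # True
```

The first function needs everyone else's scores and says nothing about what the student actually knows; the second needs only the specified standard and the student's own answers.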



One Versus Two or More Methods of Collecting Data

An important principle in methodology is that you should use at least two
maximally different methods to collect evidence for the success of programs and then
compare the conclusions that logically follow from each method. This is important
because every method of evaluation has some assumptions, some biases, and some
inevitable sources of error. Maximally different methods are less likely than
similar methods to have the same assumptions, the same biases, and the same sources
of error. Therefore, if two maximally different methods produce consistent
conclusions, those conclusions are less likely to be caused by the same underlying
assumptions, biases, and errors, and are more likely to be valid. If, on the other
hand, conclusions derived from one method are inconsistent with those derived from a
second method, then you know that either one or both of the methods contains some
source of error. Being able to check the conclusions from one method against the
conclusions from a second method can substantially increase the validity and the
credibility of your conclusions.

Multiple sources of information. Most commonly, evaluators of sex education
programs get their most valid and complete information directly from the
participants in the program. However, many times you can also obtain valuable
information from the parents or teachers of the participants. Occasionally, you can
get valid information from the school nurse and principal (if it is a school
program), from peers, and from outside observers.

Multiple methods of information. There are numerous different methods of
collecting data. These include questionnaires, unobtrusive measures (including
extant data), direct observations, and group discussions.

Questionnaires can often provide the best data about programs. They have
several advantages: they can be anonymous; they are systematic; they can cover a
wide variety of topics; and they can include questions about activities outside of
the classroom. However, they have two major disadvantages: they are self-reports
and they require the cooperation of the students and possibly teachers and others.
Thus, if respondents either intentionally or unintentionally answer questions
incorrectly, the resulting data are invalid. Questionnaires are discussed in detail
in Chapters 5 through 9.

Unobtrusive measures involve the collection of data without the knowledge or
participation of the respondents. Thus, they overcome some of the problems of
questionnaires. They include important kinds of data (e.g., pregnancy or STD rates)
that may have been collected by others for other reasons. Unobtrusive measures are
discussed in Chapter 12.



Parents, teachers, principals, outside observers, and others can directly
observe changes in the participants. Teachers, especially, can observe changes in
the students' knowledge, attitudes, and comfort as they are expressed in classroom
discussions, questions, and comments. Sometimes teachers talk with individual
students after class and help them solve their problems, and in these ways learn
about the impact that the course has had. All of these observations can be
valuable, but they may not always be valid. Too often, teachers or program staff
view the events in the program selectively; they focus more upon those events that
indicate the program is effective but fail to fully consider evidence that indicates
that the program is ineffective. Even if the staff's observations are accurate and
unbiased, they will not be viewed as solid evidence for the success of a program.
Thus, you should consider their observations a valuable source of insights, but you
should not use them to reach final conclusions about the success of a program in a
summative evaluation.

Group discussions can be particularly helpful in formative evaluations, but
they provide much less valid data for summative evaluations. That is, they are not
anonymous or systematic; they rely upon verbally expressed self-reports; and
consequently they are not likely to provide valid information about the actual
effects of the program upon them.

In sum, no single method is best; the most valid conclusions can be obtained 
from the judicious use of two or more maximally different methods. Frequently, if 
you have to use a single method, questionnaires will provide the best evidence, but 
if you have valid pregnancy or STD rates, they may better measure the programs'
impact upon these rates. 



The Role of Values in Conducting Research

The subject of values is an important and frequently discussed topic in
sexuality education courses. Similarly, it is an important topic in sexuality
education research and even more generally in most social science research. The
basic questions are, "Should your basic values affect your research?" "If so, how?"

One answer commonly held by social scientists is simple in principle, but not
necessarily simple to implement: values can or should affect your choice of the
problem you study, but once you have selected a topic of study, your values should
not affect your analysis or conclusions.

Practically, this means that you should consider your values when you decide
whether or not to evaluate sexuality education programs, and perhaps you should
even consider your values when you think about the magnitude of that effort. Thus,
it is perfectly acceptable and probably even preferable to consider the need and the
costs (both economic and social) of research in sexuality education.

Practically, this also means that once you have defined the scope of the
research, your desire to demonstrate a program's success should not bias your
research. That is, you should be committed to an unbiased, valid study and to the
advancement of knowledge, but not to the demonstration of a program's success or
failure.






At each stage of the evaluation process, there are many overt and subtle ways
that you can bias the results. For example, when designing questionnaires, you may
be tempted to ask questions that will probably reflect well on the program and
neglect to measure outcomes that will probably be negative. When administering the
questionnaires, you may be tempted to stress the importance of getting particular
kinds of findings and thereby encourage students to bias their answers and to give
you answers that please you. When reporting the results, you may be tempted to
stress only the positive results and fail to report non-findings or negative
results. Thus, there are many opportunities for you to introduce your own biases,
and you must continually guard against that. You should be careful to use
established procedures for guarding against bias, and you might consider involving
in your evaluation consultants, university professors, or others who can serve as
your super-ego.

If you allow your values and biases to enter into your evaluation process, they
will block your accurate understanding of sexuality education, reduce the faith that
people have in research findings about sexuality education (and other topics), and
partially destroy the integrity of evaluation methods.



Overall Approach to Designing and Evaluating Programs

Major stages in designing and implementing a program and evaluation are 
diagrammed in Figure 1-1. 

Determining the desired features and outcomes of programs before designing
either the program or the evaluation is very important. The specified features and
outcomes should guide the design of both the program and the evaluation. Fit
between the two helps ensure that the program and evaluation are striving to achieve
and evaluate the same set of goals and objectives. Fit helps prevent people from
designing a program with one set of goals in mind or with goals loosely defined,
then claiming during the evaluation that additional goals are important even though
the program was not designed to achieve those goals, and then learning that the
program did not meet these additional goals and is not effective.

Of course, many people often design programs, implement them, observe problems,
modify the program, and only then think about evaluating the program. Although
designing the program and the evaluation methods simultaneously is better, it is
possible to design valid evaluations well after programs have been implemented. In
this case, you should follow the bottom set of tasks in Figure 1-1, and you must be
sure that the stated goals really are the goals of the program.

After specifying the major goals and objectives of the program, you need to
design the basic structure of both the program and the evaluation. For example,
when designing the basic structure of the program, you should consider the needed
length, the number of sessions, the participants, etc. When you design the basic
structure of the evaluation, you should answer the questions discussed in this
chapter about descriptive versus evaluative information, formative versus summative
evaluation, experimental designs versus surveys, etc.

Next, you and your colleagues should specify more precisely the contents of the
curriculum and the questionnaires. For example, you should specify the particular
knowledge facts that should be covered, the specific attitudes that should be
encouraged, and the specific skills that should be taught. The curriculum should
include not only the specific topics that had been previously specified, but also
the actual activities or processes that will teach the needed knowledge, attitudes,
and skills. Your questionnaires should also be as complete as possible, including
the actual questions that you will pilot test.

Figure 1-1. Major Stages in Designing and Implementing a Program and Evaluation

STAGE 1   Specify (revise) major program goals and objectives

STAGE 2   Design (revise) program
          Design (revise) evaluation

STAGE 3   Prepare (revise) curriculum standards
          Prepare (revise) questionnaire specifications

STAGE 4   Create (revise) curriculum
          Design (revise) questionnaires or other methods of collecting data

STAGE 5   Pilot test program
          Pretest evaluation methods

STAGE 6   Implement program
          Collect evaluation data

STAGE 7   Analyze data and reach conclusions about the program;
          return to Stage 1



The next step is to pilot test both the curriculum and the evaluation methods.
This is a very important step, for it will probably suggest innumerable improvements
that should be made. After improving both program and evaluation methods, you
should implement and evaluate the actual program. The results of your analysis can
then lead to suggested improvements in your program if the evaluation finds
weaknesses or to possible expansion of your program if the evaluation is very
positive.

This sequence can actually be considered a closed loop. Once you have
recommendations for improvements or expansions, you should then revise and/or expand
the program and reevaluate it. Evaluation, then, should be a continuing process of
clarifying goals, designing or improving the program, evaluating the program, and
improving it again.






CHAPTER 2

IDENTIFYING IMPORTANT FEATURES, OUTCOMES, AND GOALS OF PROGRAMS



Frequently people evaluate programs by simply describing a program and
assessing the number of people that participate in various activities or components
of the program. Although these descriptions are helpful, they do not contain the
critical part of an evaluation, namely, the assessment of the effects of the program
upon the participants. Especially in a summative evaluation, you should carefully
measure the consequences of the program as well as describe the processes taking
place in the program.

When evaluators do assess the effects of programs, they too frequently give
insufficient thought to systematically determining the important goals and
behavioral objectives of the program. For example, many assess the program's impact
on knowledge, simply assuming that improved knowledge will lead to desired changes
in attitudes, skills, decisionmaking, and behavior. Clearly, this assumption is not
always true.

If the goals of your program either explicitly or implicitly include goals 
about changes in attitudes, skills, or behavior, then you should measure the change 
in all of these. Failure to do so can substantially reduce the effectiveness of 
both the program and the evaluation. If you fail to include an essential objective
in your program when you design a program, then that program may be much less 
effective. Similarly, if you do not measure an important outcome of your program, 
then you are not fully evaluating your program. In sum, you should carefully 
specify and then evaluate the important goals and objectives of your program. 

Task 1: Establish Major Goals

The major goals of the program can be rather broad and you can choose several 
of them. Program planners and educators have used a variety of strategies to create 
an initial list of goals. They have 1) written down every idea arising out of group 
"brainstorming"; 2) studied other existing lists of goals; and 3) observed the 
contents of different programs and reflected upon the goals of the activities. 

Following is a list of some goals that have been adapted from Volume III of
this report, Sexuality Education: [...] Guide for Adolescents. These goals
are only examples; you should consider these and others and create your own.

• Students will have a greater understanding of their own values, their 
families' values, and their culture's values and will behave more 
consistently with those values. 

• Students will behave in ways that increase their own self esteem and the
self esteem of others. 






• Students will use a systematic decisionmaking process to make important
decisions about social and sexual behavior so that their behavior is
consistent with their values and goals.

• Students will enhance their communication about sexuality and other topics
with parents, peers, and significant others.

• Students will enhance interpersonal relationships.

• Students will avoid social and sexual activity that is unwanted or
inconsistent with their values.

• Students will have fewer unwanted pregnancies.

• Students will reduce the risk of getting and spreading sexually transmitted
diseases.



Task 2: Specify Behavioral Objectives Leading to These Goals

Consider each goal, one at a time, and specify the important behaviors that
facilitate that goal. Behavioral objectives differ from goals primarily in that
they are substantially more specific. Thus, you will frequently have several
objectives for each goal, although sometimes you may have only one.

Each behavioral objective should be: 

• clear; if it is not clear, then it may confuse both the educators and the
evaluators.

• unidimensional; if the objective has more than one component, it should be
broken up into more than one objective.

• equally specific; some should not be general and others very specific.

• reasonably achievable; otherwise there is little reason to incur the cost
of trying to achieve and/or evaluate it.

• measurable; if you cannot measure an objective, you may still wish to
include it as an objective for your program, but there is little reason to
include it in the evaluation.



For example, given the goal above, "Students will have fewer unwanted
pregnancies," two possible behavioral objectives are:

• Some students will avoid unintended pregnancy by abstaining from sexual
intercourse.

• Students who are sexually active will avoid unintended pregnancy by using
effective forms of birth control.



Task 3: For Each Behavioral Objective, Specify the Necessary
Knowledge, Attitudes, and Skills

Consider each behavioral objective, and then specify all the knowledge areas,
attitudes, and skills that are needed for each objective. (Hereafter, the knowledge
areas, attitudes, and skills will be called simply the KAS components.) Often you
will find many different KAS components for each objective. You can use these as
the basis for both the program curriculum and the evaluation questionnaires.

Specifying the objectives is especially important in sexuality education
evaluation. Sometimes you cannot directly measure the important behavioral goals
and you can only measure the changes in knowledge, attitudes, and skills that
experts believe will lead to the behavioral goals. In such situations, it is
especially important to define precisely the knowledge components, attitudes,
skills, and behaviors that are needed to reach the overall goals.

The components should have the same qualities as the behavioral objectives.
That is, they should be clear, unidimensional, equally specific, achievable, and
measurable.

For example, following are KAS components for the second objective above,
"Students who are sexually active will have less unprotected intercourse by using
effective forms of birth control."



Knowledge Areas

Students will know:

• the needs of children and the responsibilities and costs of parenthood

• basic facts of reproduction and fertilization

• the important characteristics of the major effective methods of birth
control (e.g., the name, effectiveness, appropriate use, advantages and
disadvantages, cost, source, and relevant laws)

• the reasons teenagers fail to obtain and use an effective form of birth
control

• the consequences (physical, emotional, and social) of adolescent
pregnancies.

Attitudes 

Students will believe that: 

• they are probably capable of becoming pregnant or impregnating if they are
sexually active

• it is better to prevent an unwanted pregnancy than to have to deal with one

• both sexual partners have responsibility for preventing pregnancy and both
should take that responsibility

• it is important to discuss the possibility of pregnancy and the use of
birth control with a partner before becoming sexually active.

Skills 

Students will be able to:

• refrain from having sex if they cannot use some effective form of birth
control

• obtain an effective method of birth control

• use that method of birth control effectively.
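The hierarchy of goals, behavioral objectives, and KAS components described in Tasks 1 through 3 can be recorded as a simple nested data structure, which then makes it easy to check whether every component has a matching questionnaire item. The Python sketch below is only one possible way of organizing this. The class names, the function name, and the question identifier "Q7" are hypothetical; the goal, objective, and skills are taken from the example above.

```python
from dataclasses import dataclass, field


@dataclass
class Objective:
    text: str
    knowledge: list = field(default_factory=list)
    attitudes: list = field(default_factory=list)
    skills: list = field(default_factory=list)

    def components(self):
        """All KAS components for this objective, in one list."""
        return self.knowledge + self.attitudes + self.skills


@dataclass
class Goal:
    text: str
    objectives: list = field(default_factory=list)


def unmeasured(goal, questions):
    """Return KAS components that have no questionnaire item yet.

    `questions` maps a component's text to a list of question identifiers.
    """
    return [c for obj in goal.objectives for c in obj.components()
            if not questions.get(c)]


objective = Objective(
    text="Students who are sexually active will avoid unintended pregnancy "
         "by using effective forms of birth control.",
    skills=["obtain an effective method of birth control",
            "use that method of birth control effectively"],
)
goal = Goal("Students will have fewer unwanted pregnancies.", [objective])

# Hypothetical draft questionnaire: only one skill has a question so far.
questions = {"obtain an effective method of birth control": ["Q7"]}
print(unmeasured(goal, questions))
# ['use that method of birth control effectively']
```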






Steps within Each Task



Following are eight steps that will help ensure the comprehensiveness and
selection of the important goals, objectives, and KAS components. You can use these
steps with experts, members of your staff, community members, potential
participants, and other appropriate people.

Step 1: Generate a comprehensive list of goals (objectives, KAS components).

Step 2: Continue giving this list to other experts until no additional goals
(objectives, KAS components) are added.

Step 3: Organize the list in a logical manner (e.g., group items on a similar topic
together).

Step 4: Give the list to a panel of experts and have them rate each item on a
numerical scale (e.g., 1=not at all important, 2=slightly important,
3=somewhat important, 4=very important, 5=critical to the success of the
program).

Step 5: Calculate the mean score for each item.

Step 6: Send the panel of experts the following information: which items received
low ratings and should be excluded, which items received mixed ratings and
should be discussed, and which items received consistently high ratings.

Step 7: Hold a meeting of the panel of experts; have them discuss each item; give
them the opportunity to rewrite and reorganize items; and have them vote a
second time on each item.

Step 8: Calculate the mean score for each item; include only those items that are
clearly important.
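The rating steps (Steps 4 through 6) can be sketched in a few lines of Python. This is only an illustration: the item names, the ratings, the cutoff values, and the rule for calling a set of ratings "mixed" are all assumptions made for the example, not values given in this handbook.

```python
from statistics import mean

# Hypothetical panel ratings on the 1-5 importance scale (one score per expert).
ratings = {
    "effectiveness of birth control methods": [5, 5, 4, 5],
    "history of contraception": [1, 2, 1, 2],
    "talking with a partner about birth control": [5, 2, 4, 1],
}


def classify(scores, low_cutoff=2.5, high_cutoff=4.0, max_spread=1):
    """Return (mean score, recommendation) for one item.

    The cutoffs and the spread rule are illustrative assumptions,
    not values taken from the handbook.
    """
    avg = mean(scores)
    consistent = max(scores) - min(scores) <= max_spread
    if consistent and avg < low_cutoff:
        return avg, "exclude"   # consistently low ratings
    if consistent and avg >= high_cutoff:
        return avg, "keep"      # consistently high ratings
    return avg, "discuss"       # mixed or middling ratings


report = {item: classify(scores) for item, scores in ratings.items()}
for item, (avg, action) in report.items():
    print(f"{avg:.2f}  {action:8s}  {item}")
```

The same function can be run again on the second-round votes (Step 8), keeping only the items classified as "keep".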

If it is not possible to hold a meeting of the entire panel of experts, then
hold a meeting of a smaller number of the experts. If this is not possible, then
reorganize the items yourself, and send them with the ratings to the panel for a
second vote.

In general, this entire process takes much longer than people typically
estimate. For example, carefully specifying the important goals, objectives, and
KAS components for a comprehensive program may require months or even a year. The
less comprehensive the goals, and the more centrally located the panel of experts,
the more quickly the process can be completed.

Although these steps are time consuming, they are worth the effort, because
they will facilitate a much clearer and more precise specification of the goals,
objectives, and KAS components. Once these things are specified, writing the
curriculum and questions for questionnaires is relatively easy. For example, if the
effectiveness of different forms of birth control has been specified as important,
then you need to include that specific topic in the curriculum and you need to write
a question that measures knowledge about contraceptive effectiveness.






Task 4: Identify Unexpected or Undesired Effects



The preceding discussion has focused upon specifying the important goals,
objectives, and KAS components of programs. However, your program may sometimes
have unexpected and negative effects, and these should also be measured. For
example, critics of sexuality education have argued that sexuality education
programs suggest new kinds of sexual activity to students, destroy students'
morality by making them believe that premarital sex is acceptable, and increase
students' sexual activity. Although people in this country vary greatly in their
views of these possible outcomes, few programs have these outcomes as goals, and
most programs would consider them unexpected or undesired outcomes.

Such unexpected or undesired outcomes should be measured for at least three
reasons. First, your evaluation will be biased if you measure only possible desired
consequences and ignore possible undesired consequences. Second, the program staff
need to know if programs do in fact have undesired consequences, so that they can
remedy the problems. Third, if programs do not have such an impact, this should
also be documented so that there is evidence with which to respond to criticism and
concerns.






CHAPTER 3



USING EXPERIMENTAL AND QUASI-EXPERIMENTAL DESIGNS



Chapter 1 compared the use of experimental designs and survey methods in the
evaluation of sexuality education programs and argued that experimental designs
provide much stronger evidence for the causal impact of programs. This chapter will
describe several different experimental and quasi-experimental designs. It will
discuss the most rudimentary of experimental designs, describe a major weakness of
that design, provide a solution, describe another weakness, provide another
solution, and continue until the design is adequate. This method of presentation is
intended to demonstrate the rationale for each part of a good experimental design.
The chapter will then describe other problems and other experimental designs that
are useful in particular situations.

This chapter uses the symbols and terminology employed by Campbell and Stanley
in their popular book Experimental and Quasi-Experimental Designs for Research. You
should read that book if you need a fuller discussion of experimental designs. In
the diagrams:

• "X" represents the experimental treatment; in our case, it represents
participation in a sexuality education course.

• "O" represents some observation, measurement, or testing; in our case, "O"
represents the administration of questionnaires.

• "R" represents the random assignment of people to the experimental group
(those who participate in the program) and to the control group (those who
do not participate in the program).

Each row of symbols represents a different group of people. The left to right
order of the R's, X's, and O's indicates the temporal order of the random assignment
to the experimental or control groups, the participation in the program, and the
observations (administrations of the questionnaire). Symbols in the same column
indicate that those groups participate in the event (randomization, participation in
the program, or observations) at the same time.

Throughout the following discussion of different experimental designs, this
chapter will assume that the experimental treatment is participation in some type of
sexuality education course and that the observations are some type of test.
However, you should remember that the same principles apply to all types of
experimental treatments and to all types of systematic observations (knowledge
tests, other kinds of questionnaires, or other kinds of data such as pregnancy
rates).

One-Shot Case Study

This is the most rudimentary of all experimental designs and perhaps it should
be called a pre-experimental design. It contains the minimal components of an
experimental design: participation of a group of people in an experimental
treatment (a sexuality education class) followed by one observation or posttest.

     Experimental Group        X   O

Despite the rudimentary nature of the design, educators probably use this
design more than any other design to evaluate their courses. Many educators teach
their students some material and then test them at the end of that unit. If the
students perform well on the tests, the teachers believe that the students have
learned the material and that they have taught the material well. For example, if a
group of students answer correctly 90 percent of the questions on a knowledge test,
then the teacher may feel the course is effective.

Problem: Failure to measure change. This design has several flaws. Its
critical flaw is that it fails to measure how much the students actually learned.
For example, even if the students performed well on the posttests, the course may
have been totally ineffective; the students may have known just as much before the
course as after the course.

Solution: Administer tests before and after the sexuality education course in
a One-Group Pretest Posttest Design.



One-Group Pretest Posttest Design

In this design students complete a pretest before the course, then take a
course, and finally complete a posttest after the course.

     Experimental Group        O   X   O

By measuring the difference between the pretest scores and the posttest scores, the
researcher can measure the change that took place during the course. Consider the
example below:

     Experimental Group        70%   X   90%

In this example, the students answered correctly 70% of the questions on a knowledge
test before the course and 90% after the course. Thus, their test scores show a change
or improvement of (90% - 70%) or 20%. This suggests that the course was effective.

Consider a different example:

     Experimental Group        70%   X   71%

In this example the students answered correctly 70% of the questions on the pretest
and 71% of the questions on the posttest. Their improvement was only (71% - 70%) or
1%, suggesting that the course did not effectively increase knowledge.

Problem: Failure to link change to the course. This quasi-experimental design
also has a number of major flaws. The most critical flaw is that the measured
change may not have been caused by the course but may have occurred anyway. For
example, teenagers in high school are in a stage of rapid change. Regardless of
whether or not they participate in a sexuality education class, they are likely to
become more interested in sexuality, to learn more about sexuality, and to
participate in various social and sexual behaviors for the first time.




Consequently, if researchers observe the change only in the students who take the
class (the experimental group), they may incorrectly conclude that the increased
knowledge was caused by the course when in fact it would have occurred anyway.
Similarly, they may incorrectly conclude that a course caused students to engage in
sexual activity, when these students would have engaged in sexual activity anyway.
Thus, the lack of a control group is obviously a major problem.

The seriousness of not having a control group depends partly upon the length of
time between the pretests and the posttests, and partly upon the occurrence of any
special events during that elapsed time. For example, if a course is a semester or
a year long, then students may learn a significant amount about sexuality and a few
students may become more sexually active regardless of whether or not they
participate in a sexuality education course. For such an extended time period, a
control group is needed.

On the other hand, if a course is short and the pretests and posttests are only
a couple of weeks apart or less, then a control group may or may not be needed. For
example, if researchers are studying knowledge, if the pretests and posttests are
only two weeks apart, and if the students did not participate in any special event,
then the researchers can probably conclude 1) that the students would not have
increased their knowledge score significantly if they had not participated in the
course, and 2) that the course produced any significant improvement between the
pretest and posttest scores.

If the researchers were studying attitudes or behavior, however, and if the
students participated in festivities after a major football game, attended a senior
prom, or spent Easter vacation in some romantic spot, then these other special
events and not the sexuality education course may have produced the changes in
attitudes and behavior between the pretests and the posttests. In general,
researchers should always use a control group if special events, normal maturation,
or any factor other than participation in the sexuality education course may have
affected the students' scores.

Solution: To overcome these problems, researchers should use a Nonequivalent
Pretest Posttest Control Group Design.



Nonequivalent Pretest Posttest Control Group Design

In this design both the experimental and control groups complete the pretest.
The experimental group then participates in a sexuality education course, and after
the course, both the experimental and control groups complete the posttest.

     Experimental Group        O   X   O
     Control Group             O       O

The strength of this design is that it enables the researcher to compare the change
in the experimental group with the change in the control group.

To illustrate this, consider the example below:

     Experimental Group        70%   X   90%
     Control Group             71%       72%



In this example, the experimental group increased its score by 20% while the control
group increased its score only 1%. This definitely indicates that the sexuality
education class increased the knowledge of the students in the course.



Consider a second example:

     Experimental Group        75%   X   91%
     Control Group             72%       87%

In this example, the experimental group increased its percentage of correct answers
by 16%, and the control group increased its scores by 15%. Because these increases
are approximately the same, some factor other than the sexuality education class was
probably responsible for the increase in knowledge, and the data indicate that the
sexuality education class was not effective.
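The comparison made in the two examples above amounts to subtracting the control group's change from the experimental group's change. A minimal sketch follows; the function name `net_gain` is hypothetical, and the second example's posttest scores (91% and 87%) are inferred from the stated increases of 16% and 15%.

```python
def net_gain(exp_pre, exp_post, ctl_pre, ctl_post):
    """Experimental-group change minus control-group change.

    All arguments are percent-correct scores; the result is in
    percentage points.
    """
    return (exp_post - exp_pre) - (ctl_post - ctl_pre)


# First example: 70% -> 90% versus 71% -> 72%.
first = net_gain(70, 90, 71, 72)

# Second example: 75% -> 91% versus 72% -> 87%.
second = net_gain(75, 91, 72, 87)

print(first, second)
```

A large net gain, as in the first example, points toward a course effect; a net gain near zero, as in the second, suggests that something other than the course produced the change.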

Finally, consider a third example:

     Experimental Group        75%   X   91%
     Control Group             60%       63%

Problem: Dissimilar control and experimental groups. Because this design does
not include any procedure for assuring that the two groups are equivalent before the
course, the two groups may differ. In the example, the experimental group increased
its percentage of correct answers by 16% and the control group improved its scores
by 3%. This comparison alone would indicate that the course was effective.
However, the data indicate clearly that the experimental and control groups were not
similar prior to the sexuality education class. In particular, the experimental
group was already better informed than the control group. This indicates that the
students who signed up for the sexuality education class were different from the
other students. The program may have been effective for them, but may not have been
effective for other students. Alternatively, the control group may have had some
special quality that prevented them from learning a normal amount about sexuality
from everyday life. If so, then they were not an adequate group with which to
compare the experimental group.

The experimental and control groups may differ in other, unknown ways. In the
example, the dissimilarity between the control and experimental groups is obvious;
the pretests indicate that the groups have significantly different scores on the
measured variables. However, even if the two groups have similar scores on the
measured variables, they may nevertheless differ substantially on other unmeasured
variables. Thus, even if the pretest scores are similar, the control group may not
be an adequate control group.

Solution #1: One way to assure that experimental and control groups are
similar is to try to find a control group that is as similar as possible to the
experimental group. For example, one can use as a control group another class in
the high school which has the same age distribution, the same grade level, the same
level of capabilities and intelligence, and other similar characteristics.

Finding such a class may be difficult, because students who decide to take a
sexuality education course may be different from those who don't. For example, they
may be more liberal or more sexually active. If students going to college have a
full schedule, they may have greater difficulty fitting sexuality education into
their schedule, and thus sexuality courses may have fewer college-bound students.

Solution #2: Another approach is to match each student in the experimental
group with a similar student for a control group. For example, if one person who
signed up for the sexuality education class is very bright, male, Black, and 15
years old, then the researcher would try to find another very bright, male, Black,
15-year-old student who had not taken the course and add this person to the control
group. Although the procedure greatly improves the similarity of the experimental
and control groups, implementing it may be time consuming and difficult, and the
groups may still differ on unmeasured variables.
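The matching procedure can be sketched as follows. The `Student` record, its fields, and the sample students are hypothetical; a real study would choose its own matching variables and would often have to accept near matches instead of exact ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    label: str      # anonymous identifier (hypothetical)
    ability: str
    sex: str
    race: str
    age: int

    def profile(self):
        """The characteristics on which students are matched."""
        return (self.ability, self.sex, self.race, self.age)


def match_controls(enrolled, not_enrolled):
    """Pair each enrolled student with one unused, identical-profile
    student who did not take the course."""
    pool = {}
    for student in not_enrolled:
        pool.setdefault(student.profile(), []).append(student)

    pairs, unmatched = [], []
    for student in enrolled:
        candidates = pool.get(student.profile(), [])
        if candidates:
            pairs.append((student, candidates.pop(0)))
        else:
            unmatched.append(student)
    return pairs, unmatched


enrolled = [
    Student("E1", "very bright", "male", "Black", 15),
    Student("E2", "average", "female", "white", 16),
]
not_enrolled = [
    Student("C1", "average", "male", "Black", 15),
    Student("C2", "very bright", "male", "Black", 15),
]
pairs, unmatched = match_controls(enrolled, not_enrolled)
```

Students left in `unmatched` illustrate the practical difficulty noted above: some participants may have no comparable student outside the course.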

Solution #3: The preferred method is to use a Randomized Pretest Posttest
Control Group Design.



Randomized Pretest Posttest Control Group Design

This is the classical experimental design. It is the same as the former design
except that the students are randomly assigned to the sexuality education class and
the control group.

     Experimental Group     R   O   X   O
     Control Group          R   O       O

The advantage of this design is that randomly assigning students diminishes the
differences between the experimental and control groups, and thus the control group
is an excellent group with which to compare the experimental group.

Data are analyzed in the same manner as in the former design.

Problem: Impracticality. The major problem with this design does not involve
the validity or strength of the conclusions, because this is an excellent design.
The major problem is a practical administrative problem. Rarely can researchers
randomly determine whether any individual must take a sexuality education class or
must participate in a control class. For both moral and political reasons,
researchers should neither force students to take a sexuality education class nor
restrain them from taking the class.

Solution: Use the Delayed Treatment Design.



Delayed Treatment Design

In this design, students who sign up for a sexuality education course are
randomly assigned to experimental and control groups.



                                 Phase 1       |   Phase 2
                                               |
     Experimental Group     R   O   X   O      |
     Control Group          R   O       O      |   X   O



By considering only those students who sign up for the course, we solve the dilemma
of forcing students to take a sexuality education class. The students in the
control group do not take the sexuality education course during Phase 1 of the study
(to the left of the vertical line). During that phase they serve as a true control
group by taking the pretest and posttest but not the course.

Then after the posttest and during Phase 2 (to the right of the vertical line),
the control students can take the sexuality education course. This solves the
dilemma of preventing them from taking the course. If desired, the researcher can
administer a third questionnaire to the "control group" after they participate in
the course. The third questionnaire can serve as a posttest to the second
questionnaire, which now serves as a pretest. That is, the control group can provide
both control group data during Phase 1 and experimental group data during Phase 2.
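Random assignment of the students who signed up can be done with a simple shuffle. The sketch below is an illustration only: the function name, the even split between groups, and the sign-up list are assumptions, and a fixed seed is used only so that the assignment can be reproduced and audited.

```python
import random


def assign_groups(signups, seed=None):
    """Randomly split sign-ups into (experimental, control) halves.

    The experimental half takes the course in Phase 1; the control
    half waits and takes it in Phase 2.
    """
    rng = random.Random(seed)
    shuffled = list(signups)        # copy so the sign-up list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical list of twenty students who signed up for the course.
signups = [f"student_{i:02d}" for i in range(1, 21)]
experimental, control = assign_groups(signups, seed=1984)
```

Because every student in both groups asked to take the course, the two groups should differ only by chance, which is the property the delayed treatment design relies on.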

Problem: Immediate versus long-term effects. All the designs discussed thus
far measure the effects of the program at only one point in time, typically
immediately after the program. This is a significant limitation because effects may
change over time. Numerous studies demonstrate that students cram for final exams,
take the exams, and then quickly forget a substantial part of the material. Thus, a
posttest completed immediately after a course may exaggerate the long term impact of
the course.

On the other hand, some effects may not occur until months or years have
passed. For example, a sexuality education course that stresses the importance of
getting prompt medical attention if a person has any signs of sexually transmitted
disease (STD) may not have any behavioral consequences until months or years later
when some of the students get STD. Similarly, a course that emphasizes using some
effective form of birth control when sexually active cannot have any behavioral
consequences until the students become sexually active. If a posttest questionnaire
asking questions about these behaviors is administered immediately after a course,
it might indicate that the course had no behavioral effects, when in fact, the
course would affect subsequent behavior.

To overcome this limitation, a Pretest and Multiple Posttest Control Group
Design should be used.



Pretest and Multiple Posttest Control Group Design

This design is the same as the Randomized Pretest Posttest Control Group
Design except that it includes a second posttest. This design provides the best and
most valid data.

     Experimental Group     R   O   X   O   O
     Control Group          R   O       O   O

Two examples will illustrate the interpretation of the data.

     Experimental Group        68%   X   90%   85%
     Control Group             69%       70%   71%

In this example, the two groups of students began with approximately the same
knowledge. The sexuality education course increased the experimental students'
knowledge substantially, (90% - 68%) or 22%, and the control group remained about
the same. Over a period of time, the students who took the course forgot some
material. The increase dropped, (85% - 68%), to 17%, but they still knew
substantially more than the control group.

Experimental Group 70% X 90% 80% 
Control Group 71% 76% 79% 

The second set of data indicates that the sexuality education course temporarily
increased experimental students' knowledge, but that the normal life experiences of
the control group also gradually increased their knowledge; over a period of time,
the two groups were again indistinguishable (80% compared to 79%). Thus, the
sexuality education class had no long term effect.
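The interpretation of the two examples above can be restated as a short-term and a long-term comparison of each group's change from its own pretest. The following sketch uses the scores from the two data tables; the function name is hypothetical.

```python
def net_gains(exp, ctl):
    """Return (short-term, long-term) net gains in percentage points.

    exp and ctl are (pretest, first posttest, second posttest) scores.
    Each net gain is the experimental group's change from its pretest
    minus the control group's change from its pretest.
    """
    short_term = (exp[1] - exp[0]) - (ctl[1] - ctl[0])
    long_term = (exp[2] - exp[0]) - (ctl[2] - ctl[0])
    return short_term, long_term


# First example: the gain fades somewhat but persists.
first = net_gains((68, 90, 85), (69, 70, 71))

# Second example: the control group catches up over time.
second = net_gains((70, 90, 80), (71, 76, 79))

print(first, second)
```

In the first example both net gains remain large; in the second, the long-term net gain shrinks to about two points, which matches the conclusion that the class had no lasting effect.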






Problem: Administrative difficulties. Most researchers have difficulty
keeping contact with both the experimental and control groups over an extended
period of time. Students move away, graduate from high school, drop out of school,
or become lost from the sample for other reasons. Still others may remain in the
sample but refuse or simply fail to return questionnaires after the end of the
course. If a substantial percentage of respondents fail to complete one or more of
the three questionnaires, then the mean scores may become very misleading and
biased.

Solution #1: Match individual questionnaires. A partial solution to this
problem is to remove from the analysis all questionnaires for anyone who did not
complete all three questionnaires. For example, if a person completed the pretest
but neither posttest, then the pretest questionnaire would be excluded from
analysis. Alternatively, if a person completed the pretest and only the first
posttest, those questionnaires could be included in the analysis of the short term
effects, but they would be excluded in the analysis of the long term effects.

This solution requires some method of linking the pretest questionnaire for 
each person with the corresponding posttest of each person. An obvious method is to 
ask people to put their names on the questionnaires. If there is no need for 
anonymity, this is a good solution. However, in many analyses of sexuality 
education, sensitive questions are asked, and anonymity must be assured. Thus, 
including names is not allowed. An alternative strategy is to ask each student to 
put on the paper some meaningful number that 

• is unique to that student 

• can be remembered easily by that student 

• is anonymous.

If all the students have social security numbers, then they can use the last four or
five numbers in their social security number. Alternatively, they can use the four
digits representing the month and day (but not the year) of their birthday. This
method works well with small groups. If there are many students, however, more than
one student may have the same birthday, but specifying the year in which they were
born would reduce anonymity of either younger or older students. Yet another
possible number is the last four or five digits of their phone numbers. In sum,
identification numbers must be chosen with care.



This solution will prevent error caused by one group of people taking a pretest 
and different groups of people taking posttests. However, another problem may 
remain; those students who complete all the questionnaires may be significantly 
different from those students who fail to complete one or more of the 
questionnaires. If this occurs, your conclusions would apply to only those people 
who complete all the questionnaires and not to all the class participants. 
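Linking questionnaires by an anonymous number, and checking that the chosen number is in fact unique, might look like the sketch below. The four-digit birthday codes and the scores are invented for illustration, and the function names are hypothetical.

```python
from collections import Counter


def duplicate_codes(codes):
    """Return the codes used by more than one student."""
    counts = Counter(codes)
    return sorted(code for code, n in counts.items() if n > 1)


def link(pretest, posttest1, posttest2):
    """Link questionnaires that share an anonymous code.

    Each argument maps code -> score. Returns two dictionaries:
    complete cases (all three questionnaires) for long-term analysis,
    and pretest/first-posttest pairs for short-term analysis.
    """
    complete = {
        code: (pretest[code], posttest1[code], posttest2[code])
        for code in pretest
        if code in posttest1 and code in posttest2
    }
    short_term = {
        code: (pretest[code], posttest1[code])
        for code in pretest
        if code in posttest1
    }
    return complete, short_term


# Hypothetical month-and-day codes and percent-correct scores.
pretest = {"0412": 60, "1130": 70, "0704": 65}
posttest1 = {"0412": 80, "1130": 85}
posttest2 = {"0412": 75}

complete, short_term = link(pretest, posttest1, posttest2)
clashes = duplicate_codes(["0412", "1130", "0412"])
```

Running `duplicate_codes` on the codes collected at the pretest shows early whether a different kind of number is needed for a large group.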

Solution #2: Conduct a study of nonrespondents. The best way to determine
whether the course had a different impact upon those students who completed the
evaluation and those who did not is to carefully track down some of the students who
dropped out of the evaluation and to then compare them with the students who
remained in the evaluation. If students dropped out because they found the course
boring or because they became pregnant, then omitting them from the evaluation may
significantly distort your results. On the other hand, if they dropped out because
their parents moved but appear similar in other ways to the students in the
evaluation, then omitting them from the evaluation may not bias your conclusions.






Problem: Testing effects. When courses are short, taking a pretest may
increase the amount students learn and may also improve their ability to complete
the posttest. For example, after students complete a knowledge test and listen to a
lecture on the material covered in the test, they may remember the questions on the
test and pay special attention to material that answers those questions. That is,
they may selectively learn the material that was covered on the pretest. If this
happens, and if the posttest is the same as the pretest, then the difference in
scores between the pretest and the posttest may overstate the amount learned by the
students. Similarly, if after completing a test, students ask other students about
correct answers to some of the questions, then once again, the change in test scores
may overstate the benefits of the course. Finally, if the pretests and posttests
are administered only a short time apart and if they are the same tests, then on the
posttest students may simply remember their correct answers on the pretest and may
be able to devote more time to the questions that were more difficult for them.

First Solution: Use the Posttest-Only Control Group Design.

Posttest-Only Control Group Design

In this design the students are randomly assigned to the experimental and
control groups; the experimental group participates in the course; and then both
groups complete questionnaires after the course.

     Experimental Group     R   X   O
     Control Group          R       O

This design eliminates the effects of pretesting by eliminating the pretests. This
also reduces the total time required for test administration. If the randomization
is completed carefully and if the sample sizes are large (e.g., more than 100), then
the experimental and control groups should be very similar to each other before the
course begins, and any observed differences after the course should be caused by the
course. Moreover, additional posttests can be administered to measure long term
effects.

The major disadvantage of this design is that the sample size must be 
reasonably large and the randomization must be completed properly. If either of 
these conditions is not met, then the two groups may differ significantly prior to 
the course; this difference would not be measured; and it might be incorrectly 
attributed to the course. 

Second Solution: Use different but equally difficult tests, using the
Alternative Tests Design.

Alternative Tests Design

In this design students are randomly divided into two different experimental
groups and two different control groups. One experimental group and one control
group get one questionnaire, O', as a pretest and a second questionnaire, O'', as a
posttest. The remaining two groups get the two questionnaires in reverse order.

     Experimental Group     R   O'    X   O''
     Experimental Group     R   O''   X   O'
     Control Group          R   O'        O''
     Control Group          R   O''       O'




If possible, the two versions of the questionnaire should be equally difficult.
However, even if the two versions are not equally difficult, this design still works
if students do not drop out and the same number of students take both versions as a
pretest and the same number take both versions as a posttest. Then the scores for
both experimental groups can be combined (or averaged). Similarly, the scores of
the two control groups can be combined (or averaged).
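The reason the design tolerates unequal versions can be seen in a small worked example. The scores below are invented: version O'' is assumed to be about 10 points harder than O', and the two groups in each pair are assumed to be the same size so that a simple average of group means is appropriate.

```python
from statistics import mean


def combined_gain(group_a, group_b):
    """Average the two groups' pretests and posttests, then subtract.

    Each group is (pretest mean, posttest mean). The groups took the
    two questionnaire versions in opposite order and are assumed to be
    of equal size.
    """
    pretest = mean([group_a[0], group_b[0]])
    posttest = mean([group_a[1], group_b[1]])
    return posttest - pretest


# Hypothetical scores. Group A took O' then O''; group B took O'' then O'.
experimental = combined_gain((70, 80), (60, 90))
control = combined_gain((70, 61), (60, 71))

print(experimental, control)
```

Taken separately, the two experimental groups appear to gain 10 and 30 points because of the difference in difficulty; once combined, the version effect cancels and the gain is 20 points, against 1 point for the combined control groups.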

Third Solution: Use the Solomon Four Group Design.

Solomon Four Group Design

In this design one of the experimental groups and one of the control groups
complete both the pretests and the posttests, while the other experimental and
control groups complete only the posttests.



     Experimental Group     R   O   X   O
     Experimental Group     R       X   O
     Control Group          R   O       O
     Control Group          R           O



This design enables the researcher to measure directly the impact of the
pretest contamination. The design is based upon the following principle: if
students are carefully randomly assigned to groups, then the mean scores on the
pretests would be the same for all four groups if all four groups had completed the
questionnaires. If the posttest scores of the experimental (or control) group with
the pretest differ from the posttest scores of the experimental (or control) group
without the pretest, then the pretest may have affected the posttest scores.

Consider the following example:

     Experimental Group        70%   X   90%
     Experimental Group              X   85%
     Control Group             70%       74%
     Control Group                       70%

These data suggest that the pretest increased the scores in the experimental group
by about 5% and increased the scores in the control group by about 4%.
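The pretest effects quoted above come from comparing posttest scores between the pretested and unpretested groups. A minimal sketch using the example's scores follows; the function name is hypothetical, and the final line is an added illustration showing that the two unpretested groups give an estimate of the course effect that is free of pretest contamination.

```python
def pretest_effect(posttest_with_pretest, posttest_without_pretest):
    """Difference in posttest scores attributable to taking the pretest."""
    return posttest_with_pretest - posttest_without_pretest


# Posttest scores from the example above.
experimental_effect = pretest_effect(90, 85)
control_effect = pretest_effect(74, 70)

# Course effect estimated only from the groups that took no pretest.
course_effect_without_pretest = 85 - 70

print(experimental_effect, control_effect, course_effect_without_pretest)
```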

Several researchers have used the Solomon Four Group Design to measure the
impact of pretesting. Their preliminary conclusion is that pretesting has little
impact if questionnaires are administered at least a week apart. If they are
administered in the same day, pretesting may or may not have an impact.

Problem: Control group contamination. In some schools or youth agencies, the
students in the experimental group may interact with the students in the control
group. If so, one group may "contaminate" the other group. For example, if a
sexuality education class has emphasized the importance of using some form of birth
control, and a member of the sexuality education class is sexually involved with a
member of the control group, the impact of the sexuality education class may affect
both people. Alternatively, the lack of sexuality education for the control group
member may affect both of them. In either case, error is introduced into the data.
This error is particularly significant if the school or youth group is small, and
friends discuss the contents of the course with one another.



Solution: There is no design that overcomes this problem. Instead, the
researchers need to select control groups that have little interaction with the
experimental groups. If control group contamination appears to be a substantial
problem, then the researchers should consider obtaining a control group from some
other school or population of young people. This, of course, introduces new
problems, because random assignment is then difficult and the control group in a
different school or population may not be similar to the experimental group.

Problem: Inability to obtain a control group. In many situations, obtaining
an adequate control group is difficult, if not impossible. For example, everyone in
a school may take a sexuality education program at the same time and consequently,
no one is left to serve as a control group. Or more realistically, many people in a
school may take a sexuality education program and those who fail to take it are not
similar to those taking the course.

Solution: Improve the One-Group Pretest Posttest Design by increasing the
number of different pretests and posttests.



Time Series Design

This design lacks a control group, but it does include several pretests and
posttests.

     Experimental Group     O   O   O   O   X   O   O   O   O

In general, the larger the number of pretests and posttests, the more valid the 
conclusion. The additional pretests and posttests allow the researcher to establish 
a more solid basis before and after the sexuality education course, and therefore, 
to make a more conclusive claim about the effects of the course. Ideally, the time 
that elapses between the last pretest and the first posttest is similar to the time 
between the other pretests or posttests. 

Consider the following example: 

     Experimental Group     70%   72%   71%   72%   X   85%   84%   86%   84%

Assuming equal time periods, the stability of the scores before the sexuality 
education course, the sudden increase in the scores during the course, and the
stability of the scores after the course strongly suggest that the course and not
normal maturation processes produced the change. Of course, the possibility that
some other major event affected the scores still remains.

Consider a second example:

Experimental Group 65% 70% 74% X 90% 95% 

In this example, the effects are less clear. The scores increase both before and 
after the course, but they increase more during the course. This indicates that the 
course did have an effect. 

Finally, consider a third example:

Experimental Group 65% 75% 66% X 77% 67% 80% 65% 

In this example, the scores vary so much from one test to the next that one cannot
conclude that the course produced an increase even though the scores increased 11
percentage points between the last pretest and the first posttest.
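
The comparison made in these three examples can be sketched in a few lines of Python. This is only an illustration and not a procedure given in this handbook: the function name jump_vs_noise and the ratio it reports are assumptions made for the sketch. It compares the change between the last pretest and the first posttest with the average change between consecutive tests within the pretest and posttest periods.

```python
def jump_vs_noise(pretests, posttests):
    """Compare the change across the course with ordinary test-to-test change.

    Returns (jump, typical_change, ratio). These are illustrative quantities,
    not statistics defined in the handbook.
    """
    # Change between the last pretest and the first posttest.
    jump = posttests[0] - pretests[-1]
    # Absolute changes between consecutive tests within each period.
    within = [abs(b - a)
              for series in (pretests, posttests)
              for a, b in zip(series, series[1:])]
    typical_change = sum(within) / len(within) if within else 0.0
    ratio = jump / typical_change if typical_change else float("inf")
    return jump, typical_change, ratio


# The three examples from the text (scores in percent).
examples = {
    "first": ([70, 72, 71, 72], [85, 84, 86, 84]),
    "second": ([65, 70, 74], [90, 95]),
    "third": ([65, 75, 66], [77, 67, 80, 65]),
}

for name, (pre, post) in examples.items():
    jump, typical, ratio = jump_vs_noise(pre, post)
    print(f"{name}: jump={jump}, typical change={typical:.1f}, ratio={ratio:.1f}")
```

In the first example the jump is many times larger than the ordinary test-to-test change; in the third it is about the same size, which is why no conclusion can be drawn. The ratio is a rough aid to judgment, not a test of statistical significance.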

If the Time Series Design is used with questionnaires, students may stop
answering them carefully after the first few administrations. Moreover, the effects
of testing may become substantial.

The Time Series Design is most often used when pregnancy rates or other data 
are collected over time by some outside agency. For example, a school may implement
a sexuality education program and afterwards obtain estimates from nearby clinics of
the number of pregnancies in that school for each of several years before and after
the program was implemented.

Although this approach is sound in principle, it has three problems in
practice. First, the pregnancy rates for schools (or other groups of people) often
vary considerably from one year to the next. Thus, it is difficult to distinguish
between changes produced by the program and normal variations in pregnancy rates.
Second, programs are often implemented gradually over time. During the first couple
of years, if only 10% of the school participates in the program, the pregnancy rate
would be decreased by only 10% even if the program were perfectly successful in
preventing pregnancies, and this small amount of decrease could be obscured by the
normal amount of change from year to year. Third, the time lag of a program must be
estimated. Even if everyone participates in a program, the effects may not be
immediate. Thus, once again, it will be more difficult to separate changes caused
by the program from changes caused by other factors in the community.
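
The arithmetic behind the second problem can be written out as a short Python sketch. The function name, the baseline rate of 50 per 1,000, and the assumption that participants would otherwise have had the same pregnancy rate as nonparticipants are all hypothetical and are used only for illustration.

```python
def expected_rate(baseline_rate, participation, effectiveness=1.0):
    """School-wide rate when only participants benefit from the program.

    Assumes (hypothetically) that participants and nonparticipants start
    with the same baseline rate.
    """
    return baseline_rate * (1 - participation * effectiveness)


baseline = 50.0  # hypothetical: pregnancies per 1,000 students per year
rate = expected_rate(baseline, participation=0.10)  # perfectly effective program
relative_drop = (baseline - rate) / baseline
print(f"rate falls from {baseline:.1f} to {rate:.1f} "
      f"({relative_drop:.0%} decrease)")
```

If the school's rate normally moves by more than this amount from one year to the next, a decrease of this size cannot be distinguished from ordinary variation.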



Summary 

The great advantage of experimental designs is that they increase the ability 
of the researchers to control:

• the assignment of students to experimental and control groups

• the design and content of the sexuality education course 

• the relative timing of the testing and the course. 

This, in turn, greatly increases the ability of the researchers to compare:

• pretests and posttests 

• multiple posttests 

• experimental groups and control groups. 

Finally, this ability greatly increases the validity of statements about the causal 
impact of programs. Few other designs can provide such solid evidence about 
causality. 

You should select the best design that is feasible in your circumstances. At a 
minimum, you must administer pretests and posttests to an experimental group. If at
all possible, you should have a control group. If you are measuring behavior, you 
should definitely try to measure long term effects with additional posttests. If 
you have a control group that is likely to be very similar to your experimental 
group, you probably do not need to worry as much about randomly assigning people to 
the experimental and control groups, although you should do so if possible.
Similarly, you probably do not need to worry about pretest contamination and thus do 
not need to use a Solomon Four Group Design. 



CONDUCTING SURVEYS



When conducting a survey, researchers typically collect information from a 
sample of people. To do this, they usually:

• specify the problem

• select the basic parameters of the survey

• identify the important variables to be measured

• design interview schedules, questionnaires, or other methods of collecting
data

• select a sample

• conduct personal or telephone interviews, administer questionnaires, or use
other methods of collecting data

• analyze the data statistically. 

This chapter will briefly discuss 1) different kinds of surveys, 2) their 
advantages and disadvantages over experimental designs, and 3) ways to use survey
methods to evaluate sexuality education programs. Other chapters cover other
important topics in survey research: identifying important variables, designing
questionnaires, selecting a sample, administering questionnaires, and analyzing
data.

Although large surveys can be used to evaluate sexuality education, this
handbook is definitely not designed to prepare you to conduct a large survey. If
you intend to conduct such a survey, you should read a textbook on survey research.

Types of Surveys

Methods of Collecting Survey Data

Personal interviews. Some of the best examples of survey research have used
interviews based upon detailed interview schedules or questionnaires. During the
interviews, the interviewer can make sure that all questions are understood and
answered and can ask additional, more detailed or probing questions when necessary.
When interviewers are properly trained, they are especially likely to obtain
complete and valid data. However, interviews are less appropriate in evaluating
sexuality education programs; they are not anonymous and young people may be
unwilling to be honest when answering sensitive questions about sexuality.
Moreover, interviews are time consuming and costly.

Telephone interviews. It is normally much quicker, easier, and cheaper to
interview people on the telephone than in person. However, telephone interviews do
have several limitations: respondents are less likely to cooperate or to answer
personal or sensitive questions, and the interviews must be short. Telephone
interviews are sometimes used to contact a random sample of the students' parents to
ask them their views of the course and its effects upon the students.



Questionnaires. Written questionnaires are probably the best method of
collecting survey data that includes any potentially sensitive information. They
are very commonly used in both large and small surveys.



Survey Designs

One-shot surveys. In most surveys, information is collected at only one point
in time. Thus, if you intend to use a survey to measure the impact of a program or
reaction to a program, you should complete the survey after the program ends. To
measure the impact, you should collect information from both participants and
nonparticipants.

Panel surveys. In panel surveys, information is collected from the same group
of people at more than one point in time. This enables the researcher to measure 
change in the people over time. 



Surveys versus Experimental Designs

Random Assignment

In general, the distinguishing characteristic between an experimental design
and a survey is that an experiment includes the random assignment of people to 
experimental and control groups and a survey does not. However, there are many 
kinds of experimental and quasi-experimental designs and many kinds of surveys and 
they can overlap. For example, one of the quasi-experimental designs discussed in 
the previous chapter did not include random assignment to experimental and control
groups and did include giving questionnaires to experimental and control groups 
before and after the program. This is identical to a panel survey in which 
questionnaires are also given to experimental and control subjects before and after 
a program. In other words, as you take away some of the characteristics of true 
experiments and add to the characteristics of surveys, the two converge and what you
label them makes relatively little difference. 

True experimental designs with random assignment can provide the most 
compelling evidence about the causal impact of sexuality education programs. This 
is a great advantage of experimental designs, but also a major burden. For example, 
to evaluate a specific program with an experimental design, you may need to obtain 
from the school authorities and the teachers involved their approval of the 
evaluation, the experimental design, and the random assignment. You also need the
consent and cooperation of the students to be randomly assigned to the two groups
and to complete one or more questionnaires. Finally, you may need permission from
the parents to allow their young people to participate in the program and the 
experimental design. 

In contrast, to administer a survey, you simply need the cooperation of the
respondents to complete one questionnaire and you may need the permission of the 
parents for their students to complete the questionnaires. Obviously, completing 
the survey is much easier and requires less control than the experimental design. 






Sampling Flexibility

A second major advantage of surveys is their much greater sampling flexibility.
When using an experimental design, you must administer questionnaires to the group
of people who participated in a program and preferably to an additional similar
control group as well. When conducting a survey, you can administer questionnaires
to many different groups of people, although if your goal is to evaluate the impact
of sexuality education programs, you must include in your sample some people who
have taken sexuality education and some who have not.

The flexibility inherent in surveys is demonstrated in previous research. A
few studies have administered questionnaires to all students in a high school and
compared those who had taken sexuality education with those who had not. Other
studies have selected random samples of all teenagers in the United States and then
compared those who had taken sexuality education with those who had not. Still
others have taken surveys of program participants, parents of program participants,
and members of communities with new sexuality education programs.

Demonstration of Causality

The major disadvantage of surveys is that they cannot fully control other 
relevant variables and thus cannot provide compelling evidence for the causal impact 
of sexuality education programs. For example, a survey of high school students
might reveal that those students who have taken sexuality education have also been
more sexually active. However, this finding would not necessarily result from
sexuality education courses causing greater sexual activity. Rather, it could 
result from two other relationships: as high school students get older they are 1) 
more likely to take sexuality education and 2) more likely to have become sexually 
active. Thus, sexuality education would not have caused the sexual activity; 
rather, it would have been caused by the third factor, age. 

Although it may be possible to control for age statistically, other factors may
produce spurious relationships that could not be controlled for. For example,
students who choose to take sexuality education courses may be more predisposed to
become sexually active or may be more sexually active even before taking the course
than students who do not choose to take the course. Unlike experimental designs,
surveys cannot control for such factors.
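
The age example can be made concrete with a small Python sketch. The counts below are invented for illustration and do not come from any study; they are arranged so that, within each age group, students who took the course are exactly as likely to be sexually active as students who did not, yet the two groups look very different when age is ignored.

```python
# Hypothetical survey counts, keyed by (age, took_course):
# (number of students, number who are sexually active)
counts = {
    (14, True): (20, 2),
    (14, False): (80, 8),
    (17, True): (80, 40),
    (17, False): (20, 10),
}


def active_rate(took_course, age=None):
    """Proportion sexually active, overall or within one age group."""
    rows = [v for (a, t), v in counts.items()
            if t == took_course and (age is None or a == age)]
    students = sum(n for n, _ in rows)
    active = sum(k for _, k in rows)
    return active / students


# Ignoring age, course takers appear far more sexually active.
print("overall:", active_rate(True), "vs", active_rate(False))

# Comparing within each age group (controlling for age), the gap vanishes.
for age in (14, 17):
    print(f"age {age}:", active_rate(True, age), "vs", active_rate(False, age))
```

Comparing within age groups is one simple way to control for age statistically. It does nothing about factors the survey did not measure, such as a predisposition that leads students both to choose the course and to become sexually active.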

Surveys that are being used to measure the impact of sexuality education 
programs in general have another major problem. Respondents may have participated
in numerous different sexuality education programs, about which the researchers may
have little information. Including some questions in the questionnaire about the
quality and comprehensiveness of the programs would still provide insufficient 
information about the programs. 

Uses in Developing and Evaluating Programs
Although surveys have serious problems, they do have several important uses.

Assessing Needs 

When educators design sexuality education courses, they frequently find it 
useful to determine what the potential participants need by conducting needs
assessments. This entails sending a sample of potential participants a
questionnaire with the following kinds of questions:

• questions about the background of the participants

• lists of topics for participants to rate according to their importance to
the participants

• open ended questions requesting additional topics that should be covered.

Sometimes educators demonstrate need by designing a knowledge test with questions 
about facts that young people need to know, administering it to young people, and 
then demonstrating that most of the respondents missed many questions.

Assessing Community Opinion

Unfortunately, sexuality education programs are often controversial, and a 
small number of vocal people may give a misleading impression that there is a lot of
support or opposition to a program. To accurately determine the actual amount of
community support or opposition to programs, some educators have surveyed the
community.

When implementing such a survey, you should be sure to send questionnaires to
either all parents (or members of the community) or to a random sample of the
parents (or members of the community). You should also try to get a high response
rate so that the survey data accurately reflect community opinion. If only those
people who strongly favor or strongly oppose sexuality education complete and return
the questionnaire, then the data will be biased. Response rates of 80 or 90 percent
are excellent, but unusual. Researchers can normally obtain rates of 50 or 60
percent at most. If your response rates are lower than 80 or 90 percent, you should
try to determine how the respondents differed from the nonrespondents.
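
One way to look for differences between respondents and nonrespondents is sketched below in Python. The sketch assumes that something is already known about everyone on the mailing list from another source (here, the grade of the parent's child, taken from hypothetical school records); the handbook does not prescribe this method, and the data and function names are invented for illustration.

```python
# Hypothetical mailing list: (returned the questionnaire?, child's grade)
mailing = ([(True, 9)] * 30 + [(False, 9)] * 20 +
           [(True, 12)] * 25 + [(False, 12)] * 25)


def response_rate(records):
    """Proportion of mailed questionnaires that were returned."""
    return sum(1 for returned, _ in records if returned) / len(records)


def rate_by_grade(records):
    """Response rate computed separately for each grade."""
    grades = sorted({grade for _, grade in records})
    return {g: response_rate([r for r in records if r[1] == g]) for g in grades}


print("overall response rate:", response_rate(mailing))
print("by grade:", rate_by_grade(mailing))
```

If one group returns questionnaires much more often than another, the results may overrepresent that group's opinions and should be interpreted with that in mind.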

When assessing community opinion, you may find it useful to provide a list of
topics that might be covered in a course and then ask parents (or community members)
to indicate which topics should be covered and in which grades (5th grade, 6th
grade, etc.) they should be covered. If you are sending questionnaires to members
of the community in general, you may want to include a question asking whether they
have children in the school.



Assessing Immediate Reaction to Programs

As discussed in the first chapter of this volume, formative evaluation provides 
immediate feedback to the instructors about participants' reaction to the
instruction. Such feedback is often very important, because it allows instructors
to immediately change or modify the course even before it has been completed. It is
also an important addition to verbal feedback, because it is anonymous.

An important way to obtain feedback from students is to administer 
questionnaires at frequent intervals throughout the course. These questionnaires
can contain the following kinds of questions:

• lists of different topics with instructions to rate the value of each topic 

• open ended questions asking students to suggest additional topics 

• open ended questions asking students to suggest any kind of change or 
improvement 






• characteristics of the teacher with instructions to rate the teacher on
each dimension

• characteristics of the classroom environment with instructions to rate the
classroom on each dimension

• questions about how the students feel the course has or will affect them.

Questionnaires can be administered in class, distributed in class as homework,
distributed through other organizations, or mailed. However, as many teachers know,
if the questionnaires are not completed and collected in class, many people will not
return them.

Examples are found in the appendix.



Assessing Parent and Community Reaction to Programs

Because parents have the opportunity to observe their children's behavior off 
campus, they can provide an independent view of the impact of the program. One
approach is to send questionnaires to parents at the end of the program, asking them
about their reactions to the program. Such questionnaires should ask for the
following kinds of information:

• amount of parental involvement in the course

• amount of parental knowledge about the course

• parents' perception of the effects of the course upon their children

• degree to which the parents support the program

• parents' suggestions for any changes in the program (open ended questions).

Once again there are examples in the appendix.

Assessing the Effects of the Program 

There are two ways to use surveys to assess the effects of programs. The first
is to administer questionnaires to people who have taken one or more programs and 
also to people who have taken no sexuality education programs. This approach has 
the disadvantages discussed above, especially the researcher's inability to control 
relevant variables, and the lack of detailed information about the programs taken. 

The second way is to administer questionnaires to students themselves and ask
them for their perception, namely, how they believe the course has or will affect
them. This method can provide some useful supplementary information. However, by
itself, it is not a valid method for several reasons. First, if students like the
program, they will claim that it had very positive effects even if it did not.
Second, the method relies upon the students' subjective perceptions, which may be
incorrect. They may believe the course will change their behavior, when in fact
their behavior will not change or would change anyway.

Conclusion

In sum, surveys can elicit useful information from different people during the 
planning and revision of programs. They cannot provide the compelling data that can
be provided by experimental designs for the causal impact of programs, but they can
provide very useful supplementary information.







This handbook devotes considerable space to the design and use of
questionnaires to evaluate sexuality education programs. There are three reasons
for this. First, most evaluations of sexuality education programs have used
questionnaires. Second, anonymous questionnaires usually provide the best method of
collecting new data on the numerous possible outcomes of sexuality education
programs. Third, other methods may not be feasible. For example, although a
program may attempt to influence students' social and sexual behaviors, you cannot
directly measure those behaviors outside of the classroom. Similarly, interviewing
is not only very time consuming, but sacrifices the respondents' anonymity and 
probably the validity of the data as well. In sum, anonymous questionnaires are 
usually the most practical way to obtain reliable and valid data on the effects of 
sexuality education. 

Different designs require different kinds of questions. For example, if you
want to measure the impact of the program upon the participants and are giving
questionnaires to the students only at the end of the program, the questions must
ask the participants how they believe the program has and/or will affect them. In
contrast, if you are giving questionnaires to the participants both before and after
the program, the questions should measure their knowledge, attitudes, or behaviors
so that you can compare their pretest scores with their posttest scores. Thus,
different designs require different kinds of questions. 

Chapter 6 discusses questionnaires that measure participants' assessments of 
the program, and that are best administered only at the end of the program.
Chapters 7, 8, and 9 discuss questionnaires that measure knowledge, attitudes, and
behavior, and that can best be administered both before and after the program. 

Important Steps in Designing Questionnaires

To develop reliable, valid, sensitive, and appropriate questionnaires, you
should complete the following steps:

1. Determine the features and outcomes to be measured.

• Consider state guidelines, community standards, and sensitivities of the
students.

• Specify the number of questions needed to measure each feature and outcome.

2. Construct the questionnaire.

• Review other questionnaires and select questions.

• Select the best formats for the questionnaire (multiple choice questions,
Likert scales, etc.).

• Write or rewrite the questions.

• Organize the questionnaire.

• Review the questionnaire several days later.

• Review the questionnaire for sensitivity.

3. Pretest the questionnaire.

• Have other experts review the questionnaire.

• Pretest the questionnaire with a few students.

• Review the overall distributions of answers.

• Pretest the questionnaire with a larger group of students.

• Analyze the responses to each question, and modify the questions if
necessary.

4. Assess reliability and validity.

• Assess the reliability and validity of each question, and remove or modify
those questions with poor reliability and validity.

• Make sure the final distribution of questions and answers is acceptable.

• Assess the reliability of the entire questionnaire.

• Assess the validity of the entire questionnaire.

Determining the Important Features and Outcomes That Should Be Measured

Chapter 2 strongly encourages you to develop the curriculum specifications and
the test specifications simultaneously and to develop both of them before
implementing the program. If you follow that plan, your program and your evaluation
will probably be consistent. 

Unfortunately, many program providers follow a different developmental
sequence; they gradually develop and modify a program, then decide to evaluate it,
and then specify its goals and objectives in order to conduct the evaluation. In
doing this, they risk specifying new and unrealistic or unfair goals, and basing the
evaluation upon these goals rather than upon the original goals of the program.
Then they lack congruence between the actual objectives of the program and the
objectives used in the evaluation. If, for example, knowledge tests measure facts
or content areas that are not emphasized in the classroom, the questionnaires may
underestimate the amount that the students learned. It is essential to ensure that 
program and evaluation objectives are consistent. 

Considering Laws, Regulations, and Community Values

When you are deciding which features and outcomes of the program you wish to 
measure and evaluate, you need to consider any regulations pertaining to the 
administration of questionnaires to students. Although all states allow teachers to 
administer knowledge tests about sexuality, several states and many school districts
have regulations governing the administration of questionnaires about sexual
attitudes and behavior. Only a few districts forbid such questionnaires, but many
place restrictions: they may require that parents have the opportunity to see the
questionnaires, that parental permission be obtained, or that school boards approve
the questionnaires. You should be certain to meet these regulations. If you fail
to do so, your evaluation and your program may be endangered.

You should also consider the sensitivities of the students and the values of 
the community. There may be some outcomes you should not try to measure because
either your students or your community would find the questions offensive or overly
personal.



Specifying the Number of Questions to Ask in Each Area

After making final decisions about what features and outcomes you do wish to
evaluate and measure, you often need to decide approximately how many questions you
will need in order to measure each feature or outcome. For example, if you have
specified several different knowledge topics that should be covered, you then need
to decide how many questions you should ask in each area. Similarly, if you wish to
measure self esteem, you need to estimate the number of questions that you will need
in a self esteem scale. Approximate numbers of questions needed for different kinds
of scales are discussed in Chapters 6-9.

For the first draft, develop a larger number of questions than you ultimately
wish to ask in each area. During the pretesting some of the questions will probably 
be unreliable, invalid, or unusable for other reasons and will be deleted. 

Sometimes you may want to measure many program outcomes and may need to ask 
more questions than you can reasonably expect students to answer. If you have a 
sufficiently large sample of participants, then you can use matrix sampling, a 
technique that allows you to ask a greater number of questions and thereby measure 
more of the salient outcomes without making any single questionnaire too long and 
overburdening the participants. To use this technique, divide all the questions you
wish to ask into several different questionnaires. Then randomly divide your
respondents into as many groups as you have questionnaires. If the groups
are completing both a pretest and a posttest, be sure that they complete the same 
questionnaire on both. 
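
The matrix sampling steps just described can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names build_forms and assign_forms, the 60 placeholder questions, the three forms, and the 90 respondent numbers are all hypothetical. The assignment is stored so that each respondent receives the same form at the pretest and the posttest.

```python
import random


def build_forms(questions, n_forms):
    """Deal the full list of questions into n_forms shorter questionnaires."""
    return [questions[i::n_forms] for i in range(n_forms)]


def assign_forms(respondent_ids, n_forms, seed=0):
    """Randomly split respondents into n_forms groups of near-equal size.

    Returns a dict mapping each respondent id to a form index. The fixed
    seed makes the assignment reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(respondent_ids)
    rng.shuffle(shuffled)
    return {rid: i % n_forms for i, rid in enumerate(shuffled)}


questions = [f"Q{i}" for i in range(1, 61)]   # 60 placeholder questions
forms = build_forms(questions, 3)             # three forms of 20 questions
assignment = assign_forms(range(1, 91), 3)    # 90 respondents, 30 per form

# Keep `assignment` and reuse it at the posttest so that each respondent
# completes the same form both times.
print(len(forms[0]), "questions per form; respondent 1 gets form", assignment[1])
```

Each respondent answers only a third of the questions, yet every question is answered by a random third of the sample. This is why the technique requires a sufficiently large sample.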



Constructing the Questionnaire

Reviewing Other Questionnaires

Before designing your own questions and questionnaire, you should review 
existing questionnaires. They can provide a useful pool of questions, some of which 
may be appropriate for your evaluation and may also be reliable and valid. 

• Some questions will stimulate ideas about additional questions;

• Some can be modified to meet your own particular needs;

• Some can be used as they exist.

However, when reviewing questions and questionnaires created by others, you 
should remain cautious for several reasons. First, those questionnaires may not 
have been properly designed. Second, questions which are valid for another
population may not be valid for your population because your program participants
may differ in age, in educational level, or in some other fundamental way. Or your
program may differ; it may have emphasized different facts, attitudes, or behaviors.
It is easy to use questions created by others, but not always appropriate.

A few questionnaires that have been carefully developed for the evaluation of 
sexuality education programs are listed at the end of the appropriate chapters. In 
the Appendix are questionnaires that we have used in our evaluation of sexuality
education programs. 




Writing and Organizing the Questions 



Writing good questions is an art at which most people improve with practice.
However, the following guidelines can help you avoid common errors.

• Use a vocabulary that all the respondents can understand.

• Use simple sentence structure.

• Make questions as clear as possible.

• Avoid double negatives.

• Avoid any trick or misleading questions.

• Make questions (especially attitude and behavior questions)
unidimensional: in any question, focus upon only one topic, attitude, or
behavior.

Poor: How often do you talk about sexuality with your friends and parents?

Better: How often do you talk about sexuality with your friends?

How often do you talk about sexuality with your parents?

• Make sure questions are appropriately open-ended or closed-ended. Closed-ended
questions provide all possible answers and ask the respondent to
select the best of the answers (e.g., "What is your sex? ___ Male
___ Female"). Open-ended questions provide a blank space and ask the
respondent to write the answer (e.g., "Describe something you liked about
this course.").

Organizing an entire questionnaire is also an art guided by these basic 
principles:

• Provide clear, complete, and concise directions.

• Arrange questions so that they are easy to read (e.g., put each choice of a
multiple choice question on a different line).

• Organize questions by format (e.g., all multiple choice questions
together).

• Put easier questions first and more difficult questions last.

• Put closed-ended questions first and open-ended questions last.

Keep the entire questionnaire short. If at all possible, a person should be
able to complete it within 20 minutes. Questionnaires for elementary school 
students should take less time; questionnaires for college students can take more 
time. If necessary, administer different portions of the questionnaire on different 
days or use matrix sampling. 

Analyze the responses to be sure correct responses form a random pattern. 
Avoid several consecutive questions with the same answer. For example, in a 
knowledge test you should have no more than three consecutive multiple choice 
questions with the correct answer "a." Similarly, in attitude scales you should
avoid strings of items to which the respondent is likely to "Strongly Agree" or
"Strongly Disagree." If you discover such response sets, you should change the
order of questions or change the correct answers to questions. 

Finally, avoid having a disproportionate number of questions with the same 
answer. If you have a multiple choice test with five possible answers for each 
question, then about 20% of the questions should have "a" as the correct answer, 
about 20% should have "b", etc. Having as many as 40% with the same correct answer
may affect the respondents' answers and may adversely affect the validity of the
questionnaire.
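
Both checks on the answer key (no long runs of the same correct answer, and no answer used disproportionately often) can be sketched in Python. The answer key below is invented for illustration, and the function names are assumptions made for this sketch.

```python
from collections import Counter
from itertools import groupby


def longest_run(key):
    """Length of the longest string of consecutive identical correct answers."""
    return max((len(list(group)) for _, group in groupby(key)), default=0)


def answer_shares(key):
    """Share of questions for which each letter is the correct answer."""
    counts = Counter(key)
    return {letter: counts[letter] / len(key) for letter in sorted(counts)}


# Hypothetical 20-question key for a test with five choices per question.
key = "abcaadbecdaaaabcdeeb"

print("longest run:", longest_run(key))   # more than 3 calls for reordering
print("shares:", answer_shares(key))      # each should be near 0.20
```

With this key, "a" is the correct answer four times in a row and for more than a third of the questions, so some questions should be reordered or some correct answers moved to other positions.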



Reviewing your draft. After you have prepared a draft of the questionnaire,
set it aside for several days, and then review it. Normally you will find that your
fresh look will give you new insights and enable you to make numerous improvements.

At this time, you should also review the questionnaire with the sensitivity and 
values of the students and community in mind. Modify or remove any questions that 
might:

• offend the students

• offend members of the community

• inadvertently teach students incorrect information or promote improper
values and behavior.

In addition, if you intend to ask questions about personal behaviors that, if
revealed, could harm the students, you should either not ask the questions or take
all reasonable precautions to ensure that the information will remain anonymous.

Pretesting the Questionnaire

Evaluators have commented that the three most important tasks in writing a
questionnaire are pretesting, pretesting, and pretesting. Certainly pretesting is a
very important task that is not always fully completed. Our experience clearly
demonstrates that pretesting questionnaires uncovers confusion or other problems
that we could not have anticipated and thus greatly improves the final versions of
the questionnaires.

Using Experts

When writing questions, have other sexuality educators or knowledgeable 
professionals complete the questionnaire, examine it for clarity, suggest 
improvements, and if it is a knowledge test, verify the correct answers.

Using a Small Group of Program Participants 

In addition, administer the questionnaire to about 5-10 participants with
different skills and backgrounds.

• Observe any problems that they have while taking the questionnaire.

• Measure the time they need to complete it.

• Discuss each question on the questionnaire one at a time with them. 

During these discussions, you should ask them how they interpreted each
question and whether the questions contained needlessly difficult words, were clear,
or were too personal or embarrassing. Ask if there were any other sources of
confusion and how the questions could be improved. Such a process, if done
properly, can elicit numerous suggestions for improving the questionnaires.

Pretesting the Questionnaire with a Larger Group of Students

At this point, it is important to administer the questionnaire to at least 30
students. During the administration, you should observe any problems that the
students have with the questionnaire. Then score the answers and conduct an item
analysis.

Item analysis involves the examination of the responses to different questions.
Because it is particularly important in constructing knowledge tests and attitude
scales, this handbook discusses item analysis in those chapters. However, an item
analysis is often useful with other types of questions to determine whether
questions were too difficult or were misunderstood, and whether questions should be
made more or less extreme. Be especially careful when using criterion referenced
methods. Removing items or making them more difficult or easy may destroy your
criteria. For example, if you removed several items because they were difficult and
most people missed them, you might erroneously conclude that students who
subsequently took the test had a sufficient grasp of the material.
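
As a rough illustration of the simplest form of item analysis, the sketch below
computes the proportion of pretest students who answered each item correctly and
flags items that almost everyone missed or almost everyone got right. The data, the
function names, and the 0.20 and 0.90 thresholds are illustrative assumptions, not
values taken from this handbook.

```python
def item_difficulty(scored_answers):
    """Return the proportion of students who answered each item correctly.

    scored_answers: one list per student, holding 1 (correct) or 0 (incorrect)
    for each item, with the items in the same order for every student.
    """
    n_students = len(scored_answers)
    n_items = len(scored_answers[0])
    return [
        sum(student[i] for student in scored_answers) / n_students
        for i in range(n_items)
    ]


def flag_items(proportions, too_hard=0.20, too_easy=0.90):
    """Label items whose proportion correct falls outside the assumed thresholds."""
    flags = []
    for p in proportions:
        if p < too_hard:
            flags.append("review: very difficult or misunderstood")
        elif p > too_easy:
            flags.append("review: very easy")
        else:
            flags.append("ok")
    return flags


# Hypothetical scored answers from five pretest students on four items.
pretest = [
    [1, 0, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 1, 1],
]
proportions = item_difficulty(pretest)
flags = flag_items(proportions)
```

A flag is only a prompt to reread the item. With criterion referenced tests in
particular, a flagged item should not be dropped merely because it is hard.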

If you are creating scales, you may wish to use factor analysis or other
statistical methods to select or improve the questions in the scales. This requires
a sample size of at least 50 and is fully discussed in Chapter 8.

Assessing Reliability

Reliability is the consistency, replicability, or reproducibility of a
question, scale, or entire questionnaire. A reliability test measures how well you
are measuring whatever you are measuring; it does not measure how well you are
measuring what you think you are measuring.

To realize the importance of reliability, consider as an example an ordinary
bathroom scale that measures your weight. If you stood on it five times in quick
succession, and each time it gave you very different weights, you would consider the
scale unreliable. On the other hand, if it gave the same weight each time, you
would probably consider it reliable, although not necessarily accurate or valid.
Similarly, if you gave a group of students who were not in a related course a
knowledge test several times during a short period of time and they received widely
different scores each time, the test would not be reliable. If the students gave
similar answers each time, it would be reliable. More generally, if a questionnaire
fails to provide reproducible results, then it is not reliable and probably not
useful. If it does produce reproducible and consistent results, then it is reliable
and may be useful.

The reliability of a questionnaire can be measured in several different ways.
Some methods are better for particular kinds of questionnaires than others.

Test-Retest Reliability 

In the test-retest method, as suggested by the name, the researcher gives the
same people the same questionnaire twice and compares the results of each
administration. Two principles guide the time interval between the two
administrations. First, if respondents can remember their answers from the first
administration and simply repeat them during the second, the second administration
would not be an independent measure of the phenomenon being measured. Therefore,
the two administrations must be sufficiently far apart. Second, if the phenomenon
being measured changes between the two administrations, the results would change
even if the test were reliable. Thus, the two administrations should be sufficiently
close together that the phenomenon being measured has not actually changed. For
many questionnaires, a duration of 2 weeks between the two administrations is a
reasonable solution to these two conflicting principles. However, if the
questionnaire is a knowledge test, and if the students are covering in class the
material in the test, then 2 weeks is obviously too long. You should either shorten
the time interval, or give both the test and retest to a different group of students
not taking the course.

If you have a calculator with correlation or a small computer, an easy way to
compare the scores from the first administration with those of the second is to
calculate the correlation coefficient between the two administrations. If students
who scored poorly on the first administration also scored poorly on the second, and
students who scored well on the first also scored well on the second, the
correlation coefficient will be high, and the test is reliable. Correlations in the
high .80's and .90's represent high reliability; correlations in the .70's and low
.80's represent adequate or fair reliability; and correlations below .70 reflect
poor reliability.
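
The comparison described above can be sketched in a few lines of code. The example
computes the ordinary (Pearson) correlation coefficient between two administrations
and maps it onto the rough bands given in the text. The scores are invented, the
function names are hypothetical, and the 0.85 dividing line between the "low .80's"
and the "high .80's" is an assumption, because the text does not state one.

```python
import math


def pearson_r(first, second):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(first)
    mean_1 = sum(first) / n
    mean_2 = sum(second) / n
    cov = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    ss_1 = sum((a - mean_1) ** 2 for a in first)
    ss_2 = sum((b - mean_2) ** 2 for b in second)
    return cov / math.sqrt(ss_1 * ss_2)


def describe_reliability(r, high_cutoff=0.85):
    """Map a correlation onto the rough bands given in the text.

    high_cutoff is an assumed boundary between "adequate or fair" and "high".
    """
    if r >= high_cutoff:
        return "high"
    if r >= 0.70:
        return "adequate or fair"
    return "poor"


# Hypothetical scores for the same six students on the test and the retest.
test_scores = [12, 15, 9, 20, 17, 11]
retest_scores = [13, 14, 10, 19, 18, 10]
r = pearson_r(test_scores, retest_scores)
label = describe_reliability(r)
```

With so few students the coefficient is unstable, so a real check would use a
larger group.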

If you do not have a calculator with correlation or a small computer, then you
should visually compare each individual's pretest and retest scores. If the scores
appear similar, then the questionnaire is probably reliable. Of course, inspecting
visually is much less precise than calculating a correlation coefficient.

Split-Half Method 

In the split-half method, the researcher measures the same underlying
phenomenon twice, but this time measures it during the same administration with
slightly different questions. Thus, the questionnaire would contain two parts, each
of which would measure all the content areas. In a knowledge test, for example,
each content area could have two similar questions, one in each half. These pairs
of questions should be so designed that if particular students knew the answer to
one, they would probably also know the answer to the other. Then, instead of
comparing the two administrations of the same test, you would compare the two
different halves of the test. Again, an easy way to make this comparison is to
calculate the correlation coefficient between the two halves.
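
A minimal sketch of the split-half comparison follows. It assumes a hypothetical
eight-item knowledge test in which items 0-3 form one half and items 4-7 are their
paired counterparts in the other half. Each student's answers are scored 1 or 0,
each half is totaled, and the two sets of totals are correlated. The data and
function names are invented for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def half_scores(scored_answers, half_a_items, half_b_items):
    """Total each student's score separately on the two halves of the test."""
    half_a = [sum(s[i] for i in half_a_items) for s in scored_answers]
    half_b = [sum(s[i] for i in half_b_items) for s in scored_answers]
    return half_a, half_b


# Hypothetical scored answers (1 = correct, 0 = incorrect) from six students.
answers = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 0, 1, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0, 0],
    [1, 1, 1, 0, 1, 1, 1, 1],
    [1, 0, 0, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 1, 1, 0, 1],
]
half_a, half_b = half_scores(answers, [0, 1, 2, 3], [4, 5, 6, 7])
split_half_r = pearson_r(half_a, half_b)
```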



Multiple-Item Method

The multiple-item method is actually an extension of the principles involved in
the split-half method. Instead of containing only two similar questions, the
questionnaire would contain several questions measuring the same thing. For
example, when measuring attitude toward premarital intercourse, a researcher could
create a five-item scale with each item or question measuring the student's
attitude, then compare all five responses from each student. Although it is not
possible to calculate a single correlation coefficient among three or more
questions, an excellent statistic called Cronbach's alpha does summarize the extent
to which all the questions presumed to measure the same phenomenon are interrelated.
Cronbach's alpha can be interpreted in the same way as a correlation coefficient to
establish reliability and can be found in standard statistical packages such as the
Statistical Package for the Social Sciences.
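
For readers without access to a statistical package, the sketch below computes
Cronbach's alpha directly from its standard formula: alpha = (k / (k - 1)) times
(1 minus the sum of the item variances divided by the variance of the total
scores), where k is the number of items. The five-item attitude data are invented,
and the function name is a hypothetical choice for this illustration.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents' item scores.

    responses: one list per respondent, one numeric score per item.
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(responses[0])
    item_variances = [variance([r[i] for r in responses]) for i in range(k)]
    total_variance = variance([sum(r) for r in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers from five students to a five-item attitude scale,
# each item scored from 1 (strongly disagree) to 5 (strongly agree).
scale = [
    [4, 5, 4, 4, 5],
    [2, 2, 3, 2, 2],
    [3, 3, 3, 4, 3],
    [5, 4, 5, 5, 4],
    [1, 2, 1, 2, 1],
]
alpha = cronbach_alpha(scale)
```

Items worded in the opposite direction would need to be reverse-scored before
being passed to the function.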

Reliability and Criterion Referenced Measures

Some criterion referenced measures may have little variation in the answers.
That is, many people may answer the question identically. When this occurs,
test-retest correlations or multi-item reliability coefficients will be low even
when the items are reliable. Thus, contrary to the discussion above, when the
variation is low, you should not conclude that the item is not reliable, even if the
correlations are low. On the other hand, if the correlations are high, you can
still conclude that the item is reliable.

Assessing Validity

Validity measures how well you are measuring what you want to be measuring.
This is in contrast to reliability, which measures how well you are consistently
measuring whatever you are measuring. Return for a moment to the example of the
bathroom scale. If the bathroom scale is actually a thermometer in disguise and is
measuring the room temperature, then it may be reliable because it consistently
measures the same temperature, but it is not a valid measure of weight because it is
not measuring weight. More generally, questionnaires may be reliable but not valid.

By definition, you want your questionnaires to measure the phenomena that you
designed them to measure. Thus, the validity of questionnaires is very important.
When collecting evidence to demonstrate that you are in fact measuring what you want
to measure, you may rely on one or more kinds of validity.

Face Validity

Face validity, the simplest and most direct kind of validity, measures the
extent to which a question obviously or "on the face" measures the desired concept.
The examples below have high face validity, because people will probably interpret
the questions as they are intended and will answer them honestly.

What is your sex?    ____ Female    ____ Male

In what month were you born? 

How often do you talk with your parents about methods of birth control? 

In contrast, the next examples have lower face validity, because they may be
misunderstood, may be difficult to answer, or may be answered dishonestly.

How many times each year do you have sexual intercourse? ____

How effective was the program?    ____ Very effective
                                  ____ Somewhat effective
                                  ____ Not at all effective

Some young people may not know the meaning of sexual intercourse, may not remember
how many times they had intercourse, or may not be willing to answer the question
honestly if they think others may see their answers. Hence, their answers may be
invalid. The second question may be invalid because people who like their
instructors tend to overrate the effectiveness of programs.

If you are including questions of a slightly more technical nature (e.g., self
esteem items), you can ask an independent professional such as a psychologist to
assess whether the items measure the desired concept.






Because face validity can be overused and is not scientific, many texts reject
it as a serious type of validity. You should claim your questions have face
validity only if you are certain of their face validity and have no alternatives.

Content Validity 

Content validity involves the extent to which questions proportionately cover a
specified domain. To demonstrate content validity, you must demonstrate

• that experts in the field have selected certain facts, attitudes, or
behaviors that should be measured, and

• that you are actually including questions on those particular facts,
attitudes, or behaviors.

Content validity may be the most common type of validity that you will use,
especially if you are using criterion referenced methods. If you have followed the
procedures for developing criterion referenced methods and specified the important
goals, behavioral objectives, and necessary facts, attitudes, and skills, you should
be able to demonstrate a reasonably high content validity.

Content validity is most appropriate for knowledge tests and less appropriate 
for questionnaires where different respondents may interpret items differently, 
misunderstand items, or refuse to answer them. 

Criterion-Related Validity

Criterion-related validity measures the extent to which you are measuring a
desired construct by independently collecting valid data on the respondents and then
comparing that data with their responses on the questionnaires. Because this method
involves verifying questions against known and valid data, it is probably the best
type of evidence for validity. (Note that criterion-related validity is different
from criterion referenced methods. Although they have a common theoretical
underpinning, they are distinct and should not be confused.)

Criterion-related validity is difficult to implement in sexuality research.
Sometimes you can compare results from your questionnaire with results from another
questionnaire that is known to be valid. However, other than knowledge tests, few
questionnaires on sexuality have been established as valid for young people. Or you
might administer a questionnaire about use of birth control methods to young people
in a clinic, and then compare answers on the questionnaire to their clinic records.
This comparison could serve to validate the questions, and then you could use the
questionnaire independently with another similar population. However, such
opportunities are relatively rare.

Predictive Validity

The difference between predictive and criterion-related validity is simply a
matter of timing. In criterion-related validity you compare your questionnaire
results with data collected previously or simultaneously from a different and valid
source. In predictive validity you use your questionnaire data to predict
subsequent behavior and then compare your predictions with that behavior.

Because of this similarity, predictive validity has the same advantages and
disadvantages as criterion-related validity; it is a compelling method but very
difficult to implement.



Construct Validity 

Construct validity also resembles criterion-related validity to a considerable
extent. In criterion-related validity, both your questionnaire and the independent
source must measure the same concept. In construct validity, they measure two
different concepts that are theoretically related. That is, you hypothesize that
your questionnaire data will have certain kinds of relationships with other
constructs. If your hypotheses are supported by the data, then you can have greater
faith in both your hypotheses and the validity of your questionnaires. If your
questionnaire data do not support your hypotheses, then either your hypotheses are
incorrect or your questionnaires are not valid.

Validity checks may be performed in the following circumstances:

• The two constructs may be correlated with one another. For example, you
might compare or correlate test scores on your sexuality knowledge test
with scores on a general intelligence test. Because they are measuring
different content areas, they should not compare or correlate perfectly.
However, because they both measure aspects of knowledge, they theoretically
should be somewhat related.

• Different groups may be expected to perform differently on your
questionnaire. For example, you might hypothesize that freshmen would have
less knowledge about sexual activity and be less sexually active than
seniors. If your questionnaire measuring knowledge and activity supports
this hypothesis, then you have some evidence for the validity of your
questionnaire. On the other hand, if your questionnaire did not produce
the expected results, then either your hypothesis is incorrect, or your
questionnaire is invalid.

• Your experimental group should perform differently on the pretests and the
posttests. If your questionnaire finds these results, then you have some
evidence for the validity of the questionnaire. If you find no change,
then either the program did not perform as you hypothesized, or your
questionnaires were invalid.

In order for construct validity to be compelling, your hypotheses should be
supported by established theory or data. That is, if others do not find your
hypotheses compelling, they will not find this evidence for validity compelling.
Moreover, you should not generate one hypothesis, learn that it was not
supported by the data, create a new hypothesis consistent with the data, and then
claim this as evidence for construct validity. The established hypothesis must come
first and your test second.

Validity and Criterion Referenced Measures

If you measure criterion-related validity, predictive validity, or construct
validity, your assessment may be based upon correlation coefficients or other
statistics that need considerable variation. However, if you are using criterion
referenced questionnaires and have little variation in your data, then your
correlations may show that the questionnaires are not valid, when in fact they are.
The reverse is not true: if your correlations indicate that questionnaires are
valid, then they are. This is the same problem that also occurs when measuring
reliability.



Generally, you will find assessing reliability far easier than assessing most
kinds of validity. If possible, you should try to obtain evidence for validity.
You may, however, choose to do as many others have done and carefully follow the
major guidelines for creating reliable and valid questionnaires and then measure
their reliability but not their validity. The numerous procedures described in this
chapter, especially those on pretesting, will help produce reliable and valid
questionnaires.



Conclusion 







This chapter discusses issues in designing questionnaires that ask participants
to assess both the characteristics of the course and its effects upon themselves.
Such questionnaires might be given at any time during or following the program and
are particularly useful for formative evaluations. Later chapters discuss issues in
designing questionnaires to measure the actual effects of the program,
questionnaires that would be administered in pretests and posttests. This chapter
assumes that you have read Chapter 5 on the fundamentals of questionnaire design.

If properly constructed and completed, program assessments can be more
sensitive than change data from pretests and posttests and can yield types of data
that cannot be obtained from pretests and posttests. If people reflect thoughtfully
about a program and its impact, they can consider feelings and reactions to the
program that pretests and posttests cannot possibly measure, and they may recognize
subtle changes in themselves that pretests and posttests are insufficiently
sensitive to capture.

On the other hand, program assessments by participants are also notoriously
unreliable and invalid. Participants who like the staff of the program typically
rate all the characteristics of the program very positively. Participants who like
a course or program invariably indicate that the program had a greater and more
positive impact upon them than the program actually had, especially when the
program evaluation is undertaken during the last part of a program when participants
may be particularly enthusiastic about the program.

Using Participants' Assessments

Considering the advantages and disadvantages of program assessments, you can
profitably use them on several occasions:

• You want to know the participants' views on the different parts of the
program.

• You want to ask questions (about the staff, particular topics, atmosphere,
etc.) that other kinds of questionnaires do not measure.

• You want suggestions for changes and improvements.

• You want to make immediate improvements and cannot wait to compare pretests
and posttests.

• You were unable or failed to obtain pretest data.

• You were unable or failed to obtain posttest data.

• You have insufficient resources to do more than a formative assessment of
the program.

• You want to use more than one method of evaluating the program. For
example, you want to compare self-assessment data with change data from
pretests and posttests.






Writing Questions



Nearly all the steps for writing questions that are discussed in Chapter 5 also
apply to these questionnaires: you should specify the important features and
outcomes to be measured, construct the questionnaire, pretest it with other experts
and participants, and if possible assess its reliability and validity.

Your questions about the features of programs should include questions on the
following:

• the program structure (the convenience of the time, duration, place, etc.)

• the staff and their skills

• the topics covered

• the class atmosphere (comfortable, boring, personal, etc.)

• the interaction between the staff and participants.

Your questions about the outcomes of programs should include measures of all
outcomes specified as important, if they can be measured by this type of
questionnaire. In addition:

• Include open-ended questions about the weaknesses of the program and
suggestions for improvement.

• Avoid asking participants questions that they cannot answer correctly.
Participants may be able to assess how the program has already affected
them; some respondents may be able to estimate how the program will affect
them in the short term future. However, few people can accurately assess
how a program will affect them in later years.

• Allow for negative change as well as positive change. Too commonly
evaluators bias their results by phrasing questions so that there can only
be improvement. Consider the following examples:

Biased: As a result of this course, how much clearer are your values?

____ not at all clearer
____ slightly clearer
____ somewhat clearer
____ much clearer

Unbiased: As a result of this course, are your values less clear or more
clear?

____ much less clear
____ less clear
____ about the same
____ more clear
____ much more clear

• Use some open-ended questions. For example, you might ask the respondent
to describe any other effects of the program, or to make other remarks.
Often such open-ended questions provide a wealth of insight and good ideas.






Choosing Response Categories



A variety of different kinds of response categories is available to evaluate
the features and outcomes of the program. Table 6-1 contains examples of categories
that can be used with many different questions. These examples illustrate important
characteristics of response categories:

• Use response categories that fit the questions.

• Use between four and seven categories. If you use fewer than four, you may
lose detail. If you use more than seven, people may have difficulty
answering the questions and the additional precision gained may not be
real. If the respondents are very young (e.g., 5 to 12 years old), they
may have difficulty with more than 3 categories.

• Use categories that allow for both positive and negative change (questions
6, 7, and 8).

• Make the differences between adjacent categories approximately equal.

• If possible, use the same response categories for several questions. The
reader will have an easier time.






Table 6-1

Examples of Different Response Categories


To Measure Features of Programs

1. Did the teacher encourage students to ask          strongly agree
   whatever questions they had?                       agree
                                                      neutral
                                                      disagree
                                                      strongly disagree

2. Did the teacher show respect toward the            none at all
   students?                                          a small amount
                                                      a medium amount
                                                      a large amount
                                                      a great deal

3. Did the students become uncomfortable when         almost never
   sensitive questions or topics were discussed?      sometimes
                                                      about half the time
                                                      usually
                                                      almost always

4. Was the teacher enthusiastic about teaching        not at all
   this course?                                       slightly
                                                      somewhat
                                                      very


To Measure Outcomes of Programs

5. What is your opinion of the overall program?       very poor
                                                      poor
                                                      average
                                                      good
                                                      excellent

6. Did the course make you less likely or more        much less likely
   likely to think seriously before having sex        less likely
   in the future?                                     no change
                                                      more likely
                                                      much more likely

7. Did the course make your social life better        much worse
   or worse?                                          worse
                                                      no change
                                                      better
                                                      much better

8. Because of this course, do you now respect         a lot less
   yourself less or more?                             a little less
                                                      no change
                                                      a little more
                                                      a lot more




CHAPTER 7
DESIGNING KNOWLEDGE TESTS



Many people, especially teachers, are accustomed to writing knowledge tests to
grade students in the classroom. However, knowledge tests that are used to assess
the knowledge of individual students relative to one another may not be the best
knowledge tests for assessing the change in knowledge of a group of students over
time. Moreover, many knowledge tests are quickly prepared, contain ambiguous
questions, and lack a reasonable reliability and validity.

To develop a reliable, valid, and sensitive knowledge test, you should complete
the basic steps described in Chapter 5:

• Determine the important areas to be measured.

• Construct the test.

• Pretest it.

• Assess its reliability and validity.

This chapter discusses special issues to consider when applying those basic
steps to knowledge tests and assumes that you have carefully read Chapter 5.

Using Existing Knowledge Tests

As discussed in Chapter 5, existing tests can provide a useful pool of
questions, many of which may be valid if they came from carefully constructed and
tested questionnaires. However, there are only a few tests that have been carefully
developed. (References to them are at the end of the chapter, and a knowledge test
that proved useful in the evaluation of several different programs is at the end of
this volume.) If you do use existing tests, be sure that they focus upon those
specific facts and knowledge components that you should be measuring.

Selecting Formats 

The many different formats for knowledge questions that are widely used are not 
equally good. The section below briefly discusses them in order of preference. Of 
course, not all test designers would rank formats the same way. 

Some topics lend themselves to formats that are inappropriate for other topics.
Similarly, students vary as to which formats they handle best. Thus, you may want
to use a mixture of formats. For example, to test knowledge of body parts, you may
want to combine multiple-choice questions with diagrams requiring labeling.

Multiple Choice

Some multiple choice questions include a stem (the first part of an incomplete
statement) followed by several possible answers (that complete the statement).
Other multiple choice questions include a direct question with several possible
answers.

For most purposes, well constructed multiple-choice questions are the best
format. They have many advantages:

• Their multiple possible answers reduce the impact of guessing.

• If created properly, they have one and only one correct answer (which may
be "None of the above" or "All of the above").

• Most students are familiar with the format.

• Multiple-choice questions are relatively easy to answer.

• They are easy to score.

• They can cover a wide variety of topics. Contrary to popular opinion, they
can also cover major substantive points as well as specific facts.

The main disadvantage with multiple choice questions is that their incorrect answers
may inadvertently teach students incorrect information. This problem can be
minimized by reviewing the test after its final administration and by emphasizing
that multiple choice questions contain answers that are plausible but incorrect. In
general, multiple choice questions have few of the disadvantages that characterize
other formats.



Alternative Response 

An alternative response question is a multiple-choice question with only two
possible answers. For students with very limited reading and cognitive skills, the
alternative response format may be more valid, because it requires less reading and
is less complex.

A disadvantage with this format is that simply by guessing, students will, on
average, answer half of the questions correctly. This is especially a problem if
you wish to rank individual students. It is much less of a problem if you are
comparing a large number of students over time, because you will then be comparing
mean scores; if the group is sufficiently large, the number of students who guess
many questions correctly will be balanced by those who guess many questions
incorrectly. At any rate, more questions must be asked to obtain meaningful data.
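
The balancing effect described above can be made concrete with a short calculation.
The sketch below assumes the simplest possible case, in which every student guesses
every answer independently with a 50 percent chance of being right, and uses the
binomial distribution to compare the spread of one student's score with the spread
of a group's mean score. The test length, group sizes, and function name are
hypothetical.

```python
import math


def guessing_spread(n_questions, n_students, p_correct=0.5):
    """Expected score and spread when every answer is a pure guess.

    Returns a tuple of:
      expected score on the test,
      standard deviation of a single student's score,
      standard deviation of the mean score of n_students students.
    Assumes independent guesses (binomial distribution).
    """
    expected = n_questions * p_correct
    sd_student = math.sqrt(n_questions * p_correct * (1 - p_correct))
    sd_group_mean = sd_student / math.sqrt(n_students)
    return expected, sd_student, sd_group_mean


# Hypothetical 20-question two-choice test, for 1 student and for 100 students.
one_student = guessing_spread(20, 1)
hundred_students = guessing_spread(20, 100)
```

Under these assumptions a single guesser's score typically varies by about two
questions around 10, while the mean of 100 guessers varies by only about a fifth of
a question, which is why guessing matters far less for group comparisons.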



True/False 

True/false questions are best for measuring knowledge of topics that are 
unequivocally right or wrong. Often they can test knowledge about specific facts 
with few words and thus require minimal reading time. Moreover, if properly 
constructed, they are not complex and do not confuse readers. 

However, few topics in sexuality are based upon important facts which are
clearly right or wrong. This produces several problems. First, creating a
statement that is clearly right or wrong is very difficult. You may believe that a
statement is clearly correct when in fact it is not. For example, the statement
"All people can catch an STD" may be intended as a true statement that counters the
common myth that only certain types of people can catch an STD. But the statement
is not true for those people who will never have any sexual activity, who live in
remote areas of the world that are devoid of STD, or who are involved in mutually
monogamous relationships.






Second, people may answer some true/false questions correctly because they are
not informed about possible exceptions or factors which make the issue more complex.
Some may miss the question because they are informed about the exceptions and
greater complexity. Further, in an effort to make true/false questions clearly
right or wrong, test designers may add modifiers or phrase a statement in such a
manner that someone ignorant about the true answer could guess the correct answer
from the language of the question.

Finally, true/false questions have the same problem as alternative response
questions; namely, students will, on average, guess half of them correctly. This
problem can be reduced by adding a third option, "Don't Know," and/or by telling
students that they will be penalized for guessing. The number of incorrect answers
could be subtracted from the number of correct answers, while unanswered questions
or questions answered "Don't Know" are not counted.
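
The penalty rule just described (number right minus number wrong, with blanks and
"Don't Know" ignored) can be sketched as follows. The answer codes "T", "F", and
"DK", the use of None for a blank, and the function name are illustrative
assumptions.

```python
def corrected_score(answers, key):
    """Score a true/false test as (number right) minus (number wrong).

    answers: the student's responses, each "T", "F", "DK" (Don't Know),
             or None (left blank).
    key:     the correct responses, each "T" or "F".
    Blank and "DK" responses are neither added nor subtracted.
    """
    right = 0
    wrong = 0
    for given, correct in zip(answers, key):
        if given is None or given == "DK":
            continue
        if given == correct:
            right += 1
        else:
            wrong += 1
    return right - wrong


# Hypothetical six-item answer key and two students' responses.
key = ["T", "F", "F", "T", "T", "F"]
careful_student = ["T", "F", "T", "DK", None, "F"]
always_true = ["T", "T", "T", "T", "T", "T"]

careful_score = corrected_score(careful_student, key)
guesser_score = corrected_score(always_true, key)
```

A student who marks every item "True" on this balanced key gains nothing, whereas
the student who admits uncertainty keeps the credit for the items actually known.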



Matching 

This format can be an efficient method of asking several questions with similar
kinds of answers. Names of body parts, for example, could be matched with their
functions. However, for several reasons matching is not recommended as the primary
format in a knowledge test. First, it limits the kinds of questions that can be
asked. For example, all the answers in the right hand column must have a similar
format (one word answers, parts of a diagram). Second, the contents of the right
hand column must also be homogeneous. For example, if the left hand column needs a
date and there is only one date in the right hand column, then the two can be
matched without knowledge of the correct answer. To prevent this, numerous dates
must be included and this limits the kinds of questions that can be asked.

Fill-in-the-Blank

This format has two flaws. First, determining whether or not some answers are 
correct can require experts, which is costly. Second, some answers may be 
technically correct, but may not be the desired answer. For example, the question 
"Columbus discovered America in ?" can be answered with "1492," "a boat," "the 
Santa Maria," or "a state of desperation." Sometimes the most informed students are 
more likely to provide nonstandard but correct answers. Thus, the person who scores 
such tests must be knowledgeable about the material and must be prepared to give 
credit to nonstandard answers. 



Label the Figure 

Asking students to label the parts of a figure or diagram may be the only
reasonably direct format for assessing certain types of information. Moreover, this
format is commonly used by teachers in earlier grades and thus most young people are
familiar with it.

Figures have two disadvantages; they can test only one kind of information, and 
they must be scored by people familiar with the material. 

Essay 



Although essay questions may be useful in normal classroom use, scoring essay
questions from many students on pretests and posttests is too imprecise, too
difficult, and too costly. Thus, this format is not recommended for evaluation of
knowledge changes.

Selecting the Number of Questions in Each Content Area

After determining both the important content areas to measure in your knowledge 
test and the question format(s), you need to determine the number of questions to
ask in each content area. As a basic rule of thumb, you should ask between three
and five multiple choice questions for each content area. However, other rules also
apply. If you have defined your content areas very broadly, then you may wish to
ask a few more questions in each area. If you have defined them narrowly, you
should ask about three questions. If you have defined many content areas, you
should reduce the number of questions in each area to avoid making the test too 
long. If you are using question formats that are easier to answer than multiple 
choice and in which guessing has a greater impact, then you should ask more 
questions. For example, true/false questions are easy to answer and guessing plays
a larger role; you should include more of them. Kirkpatrick (1981) suggests
that one multiple choice question is about equal to three true/false questions.
However, you probably do not need to ask as many as 15 true/false questions in any 
single content area. 

Writing Questions

Writing good questions is an art, and consequently increased practice improves
that skill. However, there are numerous guidelines that can help people avoid
common errors. 

Questions in Any Format

• Ask questions about the most important facts, not about trivia.

• Make sure one and only one answer is correct.

• Use a vocabulary that all the students can easily understand (unless, of 
course, you are testing vocabulary). 

• Use simple sentence structure. 

• Avoid trick or misleading questions. 

• Be sure that questions are independent, that answers to one question do not
depend upon correctly answering another question. 

• Avoid double negatives. 

Multiple Choice Questions

• Include as much of the necessary information in the stem as possible.

• Make the possible answers as short as possible.

• Include three to five possible answers.

• Make all of the possible answers plausible.

• Avoid over-using "All of the above" and "None of the above" because they
tend to confuse students.

• Avoid having "All of the above" and "None of the above" be the correct
answer more than half the time they are included as possible answers.

• Avoid overly complicated possible answers like "a and c above."




Alternative Response and True/False Questions



• Be sure one answer is clearly better than another.

• Avoid words which tend to make statements false ("all," "always," "none,"
"never") and words which tend to make statements true ("sometimes," "under
some conditions," and '^ly").

• Limit statements to a single idea.

• Avoid negatives.

Matching Questions

• Keep the content homogeneous.

• Keep the number of items small.

• Have more answer choices than statements unless an answer can be used
twice.

• Arrange the answers in a logical manner, if possible.

Reviewing the Sequence of Correct Answers

It is easy to inadvertently have several consecutive questions or a
disproportionate number of questions with the same answer. To correct this, list
all the answers and then reorder questions, reorder answers within multiple choice
questions, and/or modify the questions.
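
Listing the answers and looking for patterns can be done by hand, but a short script makes the check repeatable. The following is a minimal sketch, not part of the handbook's procedure: the function name `review_answer_key` and the `max_run` threshold of three are assumptions chosen for illustration.

```python
from collections import Counter
from itertools import groupby


def review_answer_key(key, max_run=3):
    """Count how often each answer is correct and report long runs.

    key     -- correct answers in question order, e.g. ["a", "c", ...]
    max_run -- hypothetical threshold; runs longer than this are reported
    """
    counts = Counter(key)
    runs = []
    position = 0
    for answer, group in groupby(key):
        length = len(list(group))
        if length > max_run:
            # Store the answer, the 1-based number of the first question
            # in the run, and the length of the run.
            runs.append((answer, position + 1, length))
        position += length
    return counts, runs


# Hypothetical ten-question answer key.
key = ["a", "b", "b", "b", "b", "c", "a", "d", "b", "b"]
counts, runs = review_answer_key(key)
print("Times each answer is correct:", counts.most_common())
print("Long runs (answer, first question, length):", runs)
```

In this made-up key, "b" is correct for six of the ten questions and for four questions in a row, so you would reorder or modify some of those questions.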

Conducting an Item Analysis

An item analysis can greatly improve the test's quality and can be an important
step in its construction. To conduct an item analysis you must administer the test
to a group of people. Ideally, the group will include 30 or more people and will
resemble the participants in the course both before and after the participants
complete the course. Then the analysis has two major steps.

Step 1: Analyze the Distribution of Answers to Each Question

Observe both the level of difficulty of each question and the number of times
that each response to a question is chosen as correct. If a question is so easy
that most (about 90%) of the people answer it correctly, that question will be
useless in measuring change between the pretest and the posttest. Easy questions
should be removed or made more difficult, unless you are using criterion-referenced
tests.

Conversely, a question that is so difficult that very few people answer it
correctly may not help distinguish among students nor measure increases in knowledge
over time. If the percentage of people correctly answering a question is close to
chance, then that question is too difficult. For example, approximately 50% of
students will correctly answer a true/false question simply by guessing; if only 60%
of the students can answer it correctly, it is too difficult. Similarly,
approximately 25% of students will correctly answer a multiple choice question with
four possible answers; if only 35% of the students actually answer it correctly,
then it is too difficult. If you are not using a criterion referenced test and if a
more reasonable percentage of students will not answer it correctly after the
course, then you should remove or modify it.
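
The two difficulty checks above can be expressed as a small function. This is only a sketch under stated assumptions: the 90% cutoff comes from the text, while the margin of 10 percentage points above chance is inferred from the two examples (60% versus 50%, and 35% versus 25%) and is not a rule stated by the handbook. The name `classify_difficulty` is hypothetical.

```python
def classify_difficulty(n_correct, n_total, n_choices,
                        easy_cutoff=0.90, chance_margin=0.10):
    """Label one question as too easy, too difficult, or acceptable.

    n_choices     -- number of possible answers (2 for true/false)
    easy_cutoff   -- share correct at or above which a question is too easy
    chance_margin -- assumed margin above the guessing rate within which
                     a question counts as too difficult
    """
    proportion = n_correct / n_total
    chance = 1.0 / n_choices  # expected share correct from pure guessing
    if proportion >= easy_cutoff:
        return "too easy"
    # Round to sidestep floating point noise right at the boundary.
    if round(proportion - chance, 6) <= chance_margin:
        return "too difficult"
    return "acceptable"


# Hypothetical results from a trial group of 30 people.
items = [
    ("Q1 (true/false)", 18, 2),
    ("Q2 (4-choice)", 28, 4),
    ("Q3 (4-choice)", 19, 4),
]
for label, n_correct, n_choices in items:
    print(label, classify_difficulty(n_correct, 30, n_choices))
```

Remember that these labels only apply when you are ranking individuals or measuring change; with criterion referenced tests you would not drop items on this basis.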






Although questions that are either too easy or too difficult should sometimes
be removed or modified, the entire knowledge test should contain questions with a
rather wide range of difficulty. Lord (1952) provided a widely used chart for the
ideal level of difficulty of different kinds of questions:

Type of Question                        Average Difficulty
                                        (Percent Correct)

5-choice multiple choice                       70
4-choice multiple choice                       74
3-choice multiple choice                       77
True/false or alternative response             85
Matching or completion                         50



Of course, this chart is only a guide and not a strict criterion that should be
rigorously followed. Although questions should be neither far too difficult nor far
too easy, questionnaires can deviate significantly from this guide and still be
valid.

As indicated above, you should also observe the number of times that each 
incorrect answer is selected by the students. If students never or rarely choose an
incorrect answer in a multiple choice question, then that answer should be replaced 
by another incorrect but more believable answer. If students too frequently select 
an incorrect answer, the answer may be unfairly misleading and you should examine 
it. 
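
A tally of how often each answer was chosen makes this review straightforward. The sketch below is illustrative only: the function name `review_distractors`, the 5% cutoff for "rarely chosen," and the rule that an incorrect answer is "too frequently" chosen when it is picked more often than the correct one are all assumptions, since the text does not give numerical thresholds.

```python
from collections import Counter


def review_distractors(responses, correct, options, rare_share=0.05):
    """Flag incorrect answers that are rarely chosen or chosen too often.

    responses  -- the answer each student selected for one question
    correct    -- the correct answer
    options    -- all possible answers for the question
    rare_share -- assumed cutoff below which an answer is "rarely" chosen
    """
    counts = Counter(responses)
    n = len(responses)
    rarely_chosen = []
    too_attractive = []
    for option in options:
        if option == correct:
            continue
        if counts[option] / n < rare_share:
            rarely_chosen.append(option)
        # Assumed rule: chosen more often than the correct answer.
        if counts[option] > counts[correct]:
            too_attractive.append(option)
    return rarely_chosen, too_attractive


# Hypothetical responses from 40 students; "a" is the correct answer.
responses = ["a"] * 12 + ["b"] * 20 + ["c"] * 8
rare, attractive = review_distractors(responses, "a", ["a", "b", "c", "d"])
print("Replace with a more believable answer:", rare)
print("Examine for unfair or misleading wording:", attractive)
```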

These guidelines for finding and removing or modifying questions that are too 
easy or too difficult are important if you wish either to rank individuals or to 
measure change over time. However, if you are using criterion referenced methods 
and if your purpose is to ascertain the percentage of students that meet specified 
criteria both before and after a program, then it is important NOT to remove or 
modify items simply because items are too easy or difficult. If you did so, you
could destroy the criteria that you or experts carefully constructed and reach 
incorrect conclusions about the knowledge of the students. For example, if you 
removed all knowledge questions that everyone answered correctly, then on later 
administrations of the questionnaire you would incorrectly believe that students 
were less knowledgeable about that topic than they actually are. Of course, the 
converse is also true: if you remove all questions that everyone misses, then
later you would incorrectly conclude that the students had performed better than
they actually had. 



Step 2: Analyze the Reliability of Each Question

If the knowledge test is reliable, students who score higher should be
more informed about the content areas than students who score lower. Thus,
high scoring students should be more likely to answer any question correctly than
should low scoring students. If low scoring students tend to answer a question 
correctly and high scoring students tend to miss it, then the question may be poorly 
worded and unreliable or invalid, and you should consider improving it. 

There are two ways to assess whether high scoring students are more likely than 
low scoring students to answer a question correctly. The first method is simpler if 
you do not have a computer or do not understand correlation. It is
adequate, but not quite as elegant as the second method.






1. Separate the 25% of the questionnaires with the highest total scores and
the 25% of the questionnaires with lowest total scores from the remainder
of the questionnaires.

2. For each of these two groups of questionnaires, calculate the number of
correct responses to each question.

3. For each question, calculate the mean of the high scorers and the mean of
the low scorers and compare them.

If more high scorers answered a question correctly than low scorers, then the
question is probably reliable. If more low scorers answered a question correctly,
then the question may be unreliable and you should examine it and possibly modify or
remove it.
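
If the answers have been entered into a computer, the first method can be sketched in a few lines. The example below is only an illustration: the function name `upper_lower_comparison` and the data are hypothetical, each answer is coded 1 for correct and 0 for incorrect, and ties at the group boundaries are simply resolved by the order in which the questionnaires were entered.

```python
def upper_lower_comparison(answers, fraction=0.25):
    """Compare the top and bottom scorers on each question.

    answers  -- one list per student, with 1 (correct) or 0 (incorrect)
                for each question
    fraction -- share of questionnaires placed in each extreme group
    Returns one (high group mean, low group mean) pair per question.
    """
    ranked = sorted(answers, key=sum, reverse=True)
    group_size = max(1, int(len(ranked) * fraction))
    high = ranked[:group_size]
    low = ranked[-group_size:]
    results = []
    for item in range(len(answers[0])):
        high_mean = sum(student[item] for student in high) / group_size
        low_mean = sum(student[item] for student in low) / group_size
        results.append((high_mean, low_mean))
    return results


# Hypothetical answers from 8 students to 4 questions.
answers = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
results = upper_lower_comparison(answers)
# Question numbers (1-based) that more low scorers than high scorers
# answered correctly.
flagged = [i + 1 for i, (high, low) in enumerate(results) if low > high]
print(results)
print("Questions to examine:", flagged)
```

In this made-up data set only question 4 is answered correctly more often by the low scorers, so it is the one to examine.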

The second method is easier and more valid if you have a computer and 
understand correlation. Correlate the correctness of each question with the overall 
test score. That is, correlate whether or not question 1 is correct with the total 
test score; then correlate whether or not question 2 is correct with the overall 
test score; etc. If the correlation between a specific question and the total test 
score is high, then students who answered that question correctly also scored well 
on the entire test, and the question is reliable. If the correlation is low, then 
the students who answered that question correctly were low scorers on the other 
questions, and the question is probably unreliable.
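
The second method can be sketched as follows. This is a minimal illustration rather than a prescribed procedure: the helper names `pearson` and `item_total_correlations` and the data are hypothetical, and, as in the text, the total score includes the question being examined.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # No variation (e.g. everyone answered correctly), so the
        # correlation is undefined; report 0 in this sketch.
        return 0.0
    return cov / sqrt(var_x * var_y)


def item_total_correlations(answers):
    """Correlate each question (1 = correct, 0 = incorrect) with the total."""
    totals = [sum(student) for student in answers]
    n_items = len(answers[0])
    return [
        pearson([student[i] for student in answers], totals)
        for i in range(n_items)
    ]


# Hypothetical answers from 8 students to 4 questions.
answers = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
for number, r in enumerate(item_total_correlations(answers), start=1):
    print(f"Question {number}: r = {r:.2f}")
```

Questions with low or negative correlations are the ones to examine. In practice you would use a much larger group than the eight made-up students shown here.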



References

Existing Knowledge Tests on Sexuality

Allgeir, A.R. Sex Knowledge Survey. Midwestern Psychological Association Meetings,
1978.

Alter, J., & Wilson, P. Teaching Parents to Be the Primary Sexuality Educators of
Their Children: Guide to Designing and Implementing Multisession Courses.
Atlanta: Centers for Disease Control, 1982.

Clark & Hicks. Sex Information Questionnaire. Atlanta Adolescent Pregnancy
Project, 1969.

Kirby, D., & Alter, J. Knowledge Test, An Analysis of U.S. Sex Education Programs
and Evaluation Methods. Atlanta: Centers for Disease Control, 1979.

Lief, H.I., & Reed, D.M. Sex Knowledge and Attitude Test. Philadelphia:
University of Pennsylvania School of Medicine, 1972.

Petersen, J.C., Ryerson, W., Morris, L.A., & Senderowitz, J. The Sex Attitude and
Knowledge Survey. Arizona: Behavior Associates, 1978.



Designing Knowledge Tests

Kirkpatrick, J.S. The Magic of Structure. New York: Planned Parenthood Federation
of America, 1981.

Lord, F.M. The relationship of the reliability of multiple-choice tests to the
distribution of item difficulties. Psychometrika, 1952, 17, 181-194.

Morris, L., & Fitz-Gibbon, C. How to Measure Achievement. Beverly Hills, Calif.:
Sage Publications, 1978.

Thorndike, R., & Hagen, E. Measurement and Evaluation in Psychology and Education.
Fourth Edition. New York: John Wiley & Sons, 1977.







CHAPTER 8

DESIGNING QUESTIONNAIRES TO MEASURE ATTITUDES, VALUES, AND FEELINGS



Attitudes about sexuality are important for at least two reasons. Some, such
as self esteem (positive attitude toward oneself), we value intrinsically. In
addition, a person's attitudes may significantly affect how he or she behaves. For
example, a person's attitude toward contraception may affect his or her current or
subsequent use of contraception.

Many activities in sexuality education classes are directed toward attitudes. 
Some activities may attempt to promote basic values such as the dignity of all human
life or the immorality of using physical force in sexual relations. Others attempt
to help students clarify their own and their families' values about such issues as 
premarital sex and contraception. 

If your program has goals involving attitudes, values, or feelings, you should
not rely on simply measuring the program's impact upon knowledge or behavior,
because changes in knowledge may not lead to expected changes in attitude or
behavior. Likewise, changes in behavior may not lead to desired changes in attitude
or feelings. Thus, you should measure your program's impact on any attitudes,
values, and feelings that your goals address.

You should, however, be cautious. There is a large body of literature that 
demonstrates that attitudes are poorly related to behavior and that many other 
factors play a far more important role. Very general attitudes are especially
poorly related to behavior; attitudes specific to particular behaviors are more 
highly related to those behaviors. 

To develop reliable, valid, and appropriate scales and questionnaires for
measuring attitudes, values, and feelings, you should complete the basic steps
described in Chapter 5:

• Determine the important attitudes, values, and feelings to be measured.

• Construct the scales and questionnaire. 

• Pretest the questionnaire. 

• Assess the reliability and validity of each scale. 

This chapter discusses special issues to consider when measuring attitudes and 
assumes that you have carefully read Chapter 5. 



Selecting Important Attitudes and Values to Measure 

Commonly when deciding which outcomes to measure, you select those which are 
both important and have a reasonable chance of occurring. However, when measuring 
attitudes, you should consider both the values of the community and the needs of the
students. Just as some communities are very much opposed to teaching specific
values, so others are very much opposed to measuring personal values. Members of
the community may feel that measuring certain values may suggest to the students
that alternative values are acceptable.

You should also consider the privacy of the students. Before trying to measure
sensitive and personal attitudes or values, you should be certain that
questionnaires will remain anonymous or confidential.

On the other hand, if people opposing your program have claimed that it 
destroys cherished values, then you may want to measure these attitudes or values in 
order to determine whether your program has affected them. In such cases, you would
be measuring important values that you hope are not affected by the program, not
just values that you hope are affected.

Using Scales Constructed by Others

Psychologists have constructed innumerable scales to measure attitudes and 
other psychological traits of individuals. Some of these scales, especially those
that are more psychological (e.g., self esteem scales), may be useful to you as they
exist, or they may evoke creative ideas of your own. However, you should be careful
using them.

• Relevancy: Some scales may have titles that appear relevant, but if you
examine the actual items or questions, you will find that they do not
measure the concepts that you wish to measure.

• Reliability and validity: Look for scales that have reliability and
validity coefficients in the .SOs or higher. 

• Appropriateness: Many scales are designed for adults and are too difficult
or inappropriate for young people. Their reliability and validity may be
high for a different population, but low for yours. Sometimes you can
modify the scales slightly (e.g., lower the vocabulary level) and thereby 
make them appropriate. 

At the end of this chapter are references to potentially useful collections of 
existing scales. 

Selecting the Best Scales

All scales summarize an attitude with a single number or score. Because
attitudes frequently range over a broad continuum, a scale must also have a range of 
possible numbers; otherwise the scale would not be sufficiently precise. 

You can obtain these scores by asking a single question or a series of 
questions. However, there are several reasons to use more than one item and to 
combine the scores. First, each individual item is likely to measure partly the
desired concept and partly other undesired factors. For example, some people tend
to agree with items regardless of their content; thus, the first item in Table 8-1
would measure not only students' self esteem, but also their tendency to agree.
Other respondents may partially misread an item or focus unnecessarily upon some
part of it and answer it differently than you intended. Providing several items and
designing the items properly minimize these undesired factors and increase both the
reliability and validity.






Table 8-1

A Likert Scale to Measure Self Esteem

1. Overall, I am satisfied with myself.
   ___ Strongly Disagree (1)
   ___ Disagree (2)
   ___ Neutral (3)
   ___ Agree (4)
   ___ Strongly Agree (5)

2. I feel that I have many good personal qualities.
   ___ Strongly Disagree (1)
   ___ Disagree (2)
   ___ Neutral (3)
   ___ Agree (4)
   ___ Strongly Agree (5)

3. I feel I do not have much to be proud of.
   ___ Strongly Disagree (5)
   ___ Disagree (4)
   ___ Neutral (3)
   ___ Agree (2)
   ___ Strongly Agree (1)

4. I wish I had more respect for myself.
   ___ Strongly Disagree (5)
   ___ Disagree (4)
   ___ Neutral (3)
   ___ Agree (2)
   ___ Strongly Agree (1)

5. At times, I think I'm no good at all.
   ___ Strongly Disagree (5)
   ___ Disagree (4)
   ___ Neutral (3)
   ___ Agree (2)
   ___ Strongly Agree (1)






Second, an attitude or feeling may have a number of different aspects, and you
may want to summarize them. For example, attitudes toward premarital sex are
usually quite complex; most people are not simply for or against premarital sex.
Their attitudes depend on the circumstances: age, sex, maturity, degree of
closeness to the partner, and other factors. Asking several questions and
summarizing them in a single score increases the likelihood of obtaining a reliable
and valid score.

If you are measuring a relatively unidimensional attitude, value, or feeling, 
then you should use about five items in your scale. If you're measuring a more 
complex attitude, then you should use more items, but rarely more than 12 in a 
single scale. 



Likert Scales

Likert scales are the most popular type of scale to measure attitudes, values,
and feelings. Likert scales are based upon the idea that people's attitudes,
values, and feelings typically have both a direction (for or against) and an
intensity (neutral to strong). Table 8-1 provides an example of a Likert scale that
can be used to measure attitude toward self (self esteem). Many other examples are
in the Attitude and Value Inventory in the appendix. They all demonstrate a number
of important properties of Likert scales.

Disagree-agree responses. All Likert items include a statement and a set of
responses that range from strongly disagree to strongly agree. Thus, they measure
both the direction and the intensity of the attitude. Sometimes evaluators label
the middle category "Undecided" instead of "Neutral" or omit the middle category
altogether to force respondents to side either for or against something. In 
general, you should not force people to make such a choice unless you have a 
particular reason for doing so. 

Strongly worded items. If you use neutral or bland items, everyone may agree and then
you have obtained relatively little information. Thus, strongly worded items will
be more informative. On the other hand, if you are using criterion referenced 
methods, strongly worded items may specify a higher criterion than you believe is 
necessary. 

Positive and negative items . In the table, the first two items are positive 
items; greater agreement with the item means more self esteem. The last three items
are negative items; greater agreement with them means less self esteem. When 
respondents answer numerous consecutive positive questions, they tend to give less 
attention to each item and to give the same answer to all the questions. These 
sequences of identical answers are called response sets. In contrast, when some 
items are positive and others are negative, respondents tend to read them more
thoughtfully and to give more accurate and valid responses. Moreover, some 
respondents have a predisposition to agree or disagree with items. Including both 
positive and negative items will reduce the effects of this predisposition. In sum, 
to reduce response sets and to improve the reliability and validity, include both 
positive and negative items. 

Scoring Likert scales . At the right of each response in Table 8-1 is a number 
in parentheses that represents the score for each response. This score gives 
information about both the direction and intensity of the attitude. Because the 
direction of items 1 and 2 is the reverse of the direction of items 3 to 5, the 
numerical scores assigned to the categories are also reversed (see the numbers in
parentheses). With scores on items 3 to 5 reversed, 1 indicates a strong and
negative self esteem and 5 represents a strong and positive self esteem. With these
scores, you can simply add (or find the mean of) each individual's scores. In this
example, a total score of 25 (or a mean of 5) would represent the highest possible
self esteem; 20 (or a mean of 4) would still represent high self esteem, but with
less intensity; 5 (or a mean of 1) would represent the lowest possible self esteem.
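
If responses are entered into a computer as the category checked (1 for Strongly Disagree through 5 for Strongly Agree), the reversal for negative items can be done when scoring. The following sketch assumes that coding; the names `NEGATIVE_ITEMS` and `score_likert` are hypothetical and the item numbers refer to Table 8-1.

```python
N_CATEGORIES = 5
# Items in Table 8-1 that are worded negatively and must be reversed.
NEGATIVE_ITEMS = {3, 4, 5}


def score_likert(raw):
    """Return (total, mean) for one respondent.

    raw -- dict mapping item number to the category checked, coded
           1 (Strongly Disagree) through 5 (Strongly Agree)
    """
    scored = {}
    for item, answer in raw.items():
        if item in NEGATIVE_ITEMS:
            # Reverse the score: 1 becomes 5, 2 becomes 4, and so on.
            scored[item] = N_CATEGORIES + 1 - answer
        else:
            scored[item] = answer
    total = sum(scored.values())
    return total, total / len(scored)


# A respondent who strongly agrees with the positive items and
# strongly disagrees with the negative items.
print(score_likert({1: 5, 2: 5, 3: 1, 4: 1, 5: 1}))  # (25, 5.0)
# A respondent who agrees with the positive items and disagrees
# with the negative items.
print(score_likert({1: 4, 2: 4, 3: 2, 4: 2, 5: 2}))  # (20, 4.0)
```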



Rating Scales 

Rating scales are another popular method of measuring attitudes. Their
greatest advantage is their flexibility. You can use each question as a separate
scale or combine several questions into multi-item scales. Rating scales can
measure attitudes or feelings about a wide variety of phenomena such as dating,
premarital sexual behavior, birth control, the opposite sex, interaction with
parents, family regulations, clarity of values, and clarity of long term goals.
Note that you can use them both to measure the respondents' attitudes toward some
phenomenon (the first example below) and to measure the respondents' self assessment
of their attitude toward some phenomenon (the second example below).



How often should people use birth control if they do not wish to have
children at that time?

[Rating scale; category labels not recoverable]

How strongly do you feel that people should use birth control if they do not
wish to have children at that time?

[Rating scale; category labels not recoverable]



Rating scales have several important traits:

About 5 to 7 categories. If you have fewer than five categories, you may
unnecessarily lose detail. If you have more than seven categories, the respondents
may have difficulty choosing a category. Furthermore, the apparent additional
specificity may be misleading.

Labels at the ends of the continuum. Specify the continuum and tie down the
end points by giving the end categories labels. If it is easy to do so, you should
also give labels to other categories. Be sure that the apparent distances between
adjacent categories and labels are equal. Also be sure the end labels are
reasonable enough for some people to choose; that is, avoid extremes that no one
will choose.







Scoring. For single-item scales, simply use the numerical scores. If you have
multi-item scales, score them the same way you score Likert scales, being careful to
reverse the scores of any negative items.



Semantic Differential Scales

Another popular kind of attitude scale, the semantic differential, also 
involves ratings of feelings about a concept (Table 8-2). Semantic differential
scales have the following characteristics:



Table 8-2

Semantic Differential Scale Used
to Measure Attitude toward Contraception

Directions: Indicate your feeling toward contraception by reading the pair of words
on each line and quickly checking the line that best indicates your feeling. If you
have no feelings about contraception, check the middle space. If your feeling is
more similar to the word on the left, check a line closer to the word on the left.
If it is more similar to the word on the right, check a line closer to the word on
the right. Answer once and only once for each pair of words.

good          __ __ __ __ __ __ __   bad
wrong         __ __ __ __ __ __ __   right
responsible   __ __ __ __ __ __ __   irresponsible
fair          __ __ __ __ __ __ __   unfair
strong        __ __ __ __ __ __ __   weak
ineffective   __ __ __ __ __ __ __   effective
cold          __ __ __ __ __ __ __   warm
dirty         __ __ __ __ __ __ __   clean



Adjectives and antonyms. Each scale specifies a particular concept and then
contains a series of adjectives and their antonyms that might describe the concept.
Any set of adjectives and their antonyms can be used, provided, of course, that they
have some relevance to the concept in question. The adjectives and their antonyms
are separated by either five or seven underlined spaces.

Random order of adjectives. When creating a scale, randomly order the side on
which the positive adjective is presented. For the same reasons discussed about
Likert scales, avoid having all positive words on one side and all negative words on
the other.






Because the semantic differential forces people to choose between single
adjectives, it best measures general impressions rather than specific, complex, or
detailed attitudes. Specific attitudes are measured better with a Likert scale.

Scoring. Semantic differential scales may be scored similarly to Likert
scales. Assign each line a number, with 1 always representing the most negative
attitude and 7 (or 5) representing the most positive attitude. Then add the scores
for all the lines, or find the mean of all the scores.
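
Because the positive word appears sometimes on the left and sometimes on the right, the scoring must take the side into account. The sketch below illustrates one way to do this, assuming seven spaces per line and responses recorded as the space checked, counted from the left. The word pairs are those in Table 8-2; which word in each pair counts as positive is a judgment made here for illustration, and the names `PAIRS` and `score_semantic_differential` are hypothetical.

```python
N_SPACES = 7  # the text allows either five or seven spaces per line

# (left word, right word, True if the positive word is on the left)
PAIRS = [
    ("good", "bad", True),
    ("wrong", "right", False),
    ("responsible", "irresponsible", True),
    ("fair", "unfair", True),
    ("strong", "weak", True),
    ("ineffective", "effective", False),
    ("cold", "warm", False),
    ("dirty", "clean", False),
]


def score_semantic_differential(checked):
    """Return (total, mean) for one respondent.

    checked -- for each pair, in order, the space that was checked,
               counted 1..N_SPACES from the left
    """
    scores = []
    for (_, _, positive_on_left), position in zip(PAIRS, checked):
        if positive_on_left:
            # Leftmost space is most positive, so reverse the count.
            scores.append(N_SPACES + 1 - position)
        else:
            scores.append(position)
    total = sum(scores)
    return total, total / len(scores)


# A respondent who always checks the space nearest the positive word.
print(score_semantic_differential([1, 7, 1, 1, 1, 7, 7, 7]))  # (56, 7.0)
# A respondent who always checks the middle space.
print(score_semantic_differential([4] * 8))  # (32, 4.0)
```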

Previous research has demonstrated that many adjectives are highly related to
three basic and independent dimensions: potency (how powerful or effective),
activity (how active), and evaluation (how good or bad). Consequently, the semantic
differential is good for measuring broad, general feelings about something, but is
not good for measuring specific attitudes or attitudes that do not primarily involve
potency, activity, and/or evaluation.



Behavioral Intentions

Behavioral intentions questions are specifically designed to better predict 
behavior. Evaluators often attempt to learn how people do or would behave in 
sexuality-related situations by asking about attitudes. For example, consider this
question, which measures a general attitude but not an intention:



Two people who do not wish to have children
and who have sex should definitely use some
form of contraception.

___ Strongly agree
___ Agree
___ Neutral
___ Disagree
___ Strongly disagree



Attitudes, however, do not always accurately predict behavior. Other factors 
such as norms, habits, peer pressures, economic factors, personality factors, and
special circumstances also influence behavior. Thus, evaluators ask questions about
what respondents believe they would actually do in a given set of circumstances.
Research demonstrates that this type of question better predicts behavior than
questions asking about a general attitude. Consider the following example:

















If you knew that you were going to have sex this week, how likely is it that you
would use some form of contraception?

1    2    3    4    5    6    7

[Rating scale; end labels not recoverable]

This question is clearly more behaviorally oriented than the earlier one and would 
probably better predict behavior. Thus, it will probably better evaluate the impact 
of a program upon behavior than would attitude questions. Remember, however, that
even questions about behavioral intentions cannot always accurately predict
behavior, especially in the area of sexual activity among adolescents. For example,
many adolescents who believe that they will act responsibly when sexually involved 
do not do so when they actually become involved. 






A disadvantage of these questions is that combining several different questions
into a single score is sometimes difficult, although you can describe different 
situations and find the average of these scores. 



Other Kinds of Scales 

Psychologists have developed several other kinds of scales that have been 
commonly used: Thurstone scales, Guttman scales, and various sociometric scales.
This handbook does not explain them because they are substantially more difficult to
develop than the scales described above and thus are not recommended. However, you
can read about them in the reference books listed at the end of the chapter. 

Constructing and Pretesting the Scales

Reliable and valid scales are surprisingly difficult to construct. A question
or item that is very clear to you may have very different meanings for the
respondents. Thus, it is especially important to follow the steps described in
Chapter 5 for constructing and pretesting questionnaires. 

In addition, if you are creating multi-item scales and have a computer on which
you can easily obtain correlation coefficients between different items, follow the 
procedures below. If you cannot easily obtain correlation coefficients, you should 
seriously consider using existing scales that have been properly validated. 

All the items in a scale should measure the same trait and consequently, should 
be highly correlated with one another. 

• Step 1: Create about twice as many items as you need for each scale.

• Step 2: Administer the questionnaire to at least 50 and preferably 100
people.

• Step 3: Calculate the correlations between each pair of items in each
scale. For example, if you wrote 20 items for a scale, find the
correlations between each of the 20 items and the other 19 items.

• Step 4: Examine the correlations for each scale and throw out those items
which are poorly correlated with the other items. Keep only those items
that are highly intercorrelated. Be sure that you keep the correct number
of items needed for your scale. For example, if you want a scale with 8
items, keep the 8 items that have the highest intercorrelations.

• If you do not have enough items that are highly intercorrelated, then
improve the items, add to them, and repeat this entire process.

• Step 5: Review your final selection to make sure it includes a balance of
positive and negative items.
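
Steps 3 and 4 can be sketched in a short program. This is one possible reading of "highest intercorrelations," not the handbook's prescribed computation: each item is ranked by its average correlation with all of the other items, and the top items are kept. The names `pearson` and `select_items` and the data are hypothetical, and the sketch assumes that negative items have already been reverse scored so that all items point in the same direction.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # no variation; treat the correlation as 0 here
    return cov / sqrt(var_x * var_y)


def select_items(responses, n_keep):
    """Keep the n_keep items with the highest average intercorrelation.

    responses -- one list of item scores per respondent
    Returns (item numbers kept, average correlation for every item).
    """
    n_items = len(responses[0])
    columns = [[row[i] for row in responses] for i in range(n_items)]
    average_r = []
    for i in range(n_items):
        others = [pearson(columns[i], columns[j])
                  for j in range(n_items) if j != i]
        average_r.append(sum(others) / len(others))
    ranked = sorted(range(n_items), key=lambda i: average_r[i], reverse=True)
    kept = sorted(ranked[:n_keep])
    return [i + 1 for i in kept], average_r  # 1-based item numbers


# Hypothetical scores of 10 respondents on 4 trial items.
item1 = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
item2 = [1, 2, 3, 4, 5, 2, 2, 3, 4, 4]
item3 = [2, 2, 3, 4, 4, 1, 2, 3, 4, 5]
item4 = [5, 1, 3, 1, 5, 1, 5, 3, 5, 1]
responses = [list(row) for row in zip(item1, item2, item3, item4)]

kept, average_r = select_items(responses, n_keep=3)
print("Average correlation with the other items:",
      [round(r, 2) for r in average_r])
print("Items kept:", kept)
```

In this made-up example the fourth item is unrelated to the other three and is dropped. With real data you would use at least the 50 respondents recommended in Step 2.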

This examination of the correlation coefficients can be improved and simplified
by employing factor analysis, but you should attempt this only if you already have a
basic understanding of factor analysis. Steps 1 and 2 would be the same.

• Step 3: For each scale, one at a time, run a factor analysis on the items.
Either limit the number of possible factors to one or allow more factors,
but prevent rotation. The items with the highest factor loadings on the
first (or only) factor should be the items that best measure the desired
attitude.

• Step 4: Throw out those items with the lowest loadings. If several items
have similar loadings, you can throw out the poorest items and repeat the
factor analysis. If several items still have similar loadings you can
employ other criteria for keeping items (e.g., apparent face validity).

• Step 5: Review your final selection of items to make sure it has a balance
of positive and negative items.
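
The idea behind Step 3 can be illustrated with a rough sketch. Note the substitution: the code below does not perform a true factor analysis. It computes loadings on the first unrotated principal component of the item correlation matrix, which is a common approximation to a one-factor solution, using simple power iteration. A statistical package would normally be used instead. All function names and data are hypothetical, and negative items are assumed to be reverse scored already.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # no variation; treat the correlation as 0 here
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Correlations between every pair of items (1.0 on the diagonal)."""
    n = len(columns)
    return [[1.0 if i == j else pearson(columns[i], columns[j])
             for j in range(n)] for i in range(n)]


def first_component_loadings(corr, iterations=200):
    """Approximate loadings on the first unrotated principal component."""
    n = len(corr)
    vector = [1.0] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(corr[i][j] * vector[j] for j in range(n))
                   for i in range(n)]
        norm = sqrt(sum(v * v for v in product))
        vector = [v / norm for v in product]
        eigenvalue = norm  # converges to the largest eigenvalue
    # Loading = eigenvector element times the square root of the eigenvalue.
    return [v * sqrt(eigenvalue) for v in vector]


# Hypothetical scores of 10 respondents on 4 trial items (one list per item).
columns = [
    [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
    [1, 2, 3, 4, 5, 2, 2, 3, 4, 4],
    [2, 2, 3, 4, 4, 1, 2, 3, 4, 5],
    [5, 1, 3, 1, 5, 1, 5, 3, 5, 1],
]
loadings = first_component_loadings(correlation_matrix(columns))
for number, loading in enumerate(loadings, start=1):
    print(f"Item {number}: loading = {loading:.2f}")
```

In this made-up example the first three items load highly and the fourth loads near zero, so the fourth would be the first candidate to throw out in Step 4.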



References

Books Containing Attitude Scales

Bonjean, C.M., Hill, R.J., & McLemore, S.D. Sociological Measurement: An Inventory
of Scales and Indices. San Francisco: Chandler, 1967.

Buros, O. Mental Measurements Yearbook. 6th ed. Highland Park, N.J.: Gryphon
Press, 1970.

Chun, K., Cobb, S., & French, J.R.P. Measures for Psychological Assessment. Ann
Arbor, Mich.: University of Michigan, Institute for Social Research, 1973.

Miller, D. Handbook of Research Design and Social Measurement. New York: McKay,
1964.

Robinson, J.P., & Shaver, R. Measures of Social Psychological Attitudes. Ann
Arbor, Mich.: University of Michigan, Survey Research Center, 1969.

Shaw, M.E., & Wright, J.M. Scales for the Measurement of Attitudes. New York:
McGraw-Hill, 1967.



Books on Attitude Scaling

Babbie, E.R. The Practice of Social Research. Belmont, Calif.: Wadsworth
Publishing, 1975.

Henerson, M.E., Morris, L.L., & Fitz-Gibbon, C.T. How to Measure Attitudes.
Beverly Hills, Calif.: Sage Publications, 1978.

Kerlinger, F.N. Foundations of Behavioral Research. 2d ed. New York: Holt,
Rinehart, and Winston, 1973.

Miller, D.C. Handbook of Research Design and Social Measurement. 3d ed. New York:
David McKay Co., Inc., 1977.

Thorndike, R.L., & Hagen, E.P. Measurement and Evaluation in Psychology and
Education. 4th ed. New York: John Wiley & Sons, 1977.






CHAPTER 9

DESIGNING QUESTIONNAIRES TO MEASURE BEHAVIOR AND SKILLS



Many educational programs are ultimately concerned with influencing long term
behavior. Courses may focus upon knowledge, but the supporters of the course often
hope or believe that improved knowledge will subsequently improve decisionmaking and
behavior. Many sexuality education programs have as explicit or implicit long term
goals:



• the increase in communication with parents

• the delay of sexual activity

• the reduction of unwanted pregnancy

• the reduction of sexually transmitted diseases

• more generally, the improvement of social and sexual relationships.



If your program either explicitly or implicitly has goals regarding behaviors, then 
you should measure its impact upon these behaviors.

Sexuality educators are often concerned with several different components of
behavior:



• the amount and/or frequency of individuals' behavior
• the skillfulness or effectiveness of their behavior
• the individuals' feelings about their behavior.



For example, consider communication between two people about sexuality. Your
program and your evaluation may be concerned with 1) whether any communication took
place, and if it did, how frequently and for how long, 2) whether the participants
used important communication skills, and if they did, how effective the
communication was, and 3) how comfortable the participants were.

Similarly, consider the use of birth control methods. The program and
evaluation may be concerned with 1) how frequently birth control methods were used,
2) how effectively or properly each method was used, and 3) how comfortably and with
what feelings participants obtained and used a method.

The previous chapter discussed methods of measuring attitudes and feelings
about different behaviors. This chapter focuses upon methods of measuring the
amount and the effectiveness of the behaviors. This chapter assumes that you have
carefully read Chapter 5 on the fundamentals of questionnaire design and that you
are following its major steps for constructing questionnaires:

• Determine the important behaviors to be measured.

• Construct the test.

• Pretest the questionnaire.

• Assess its reliability and validity.






Determining the Important Behaviors to Be Measured
Social and Political Costs and Benefits

Despite the obvious need to evaluate the impact of programs upon behavior, all
of us doing research in sexuality education need to weigh carefully the costs and
benefits of doing such research. We need to compare the potential costs to the
student respondents, to the program, and to the community with the potential
usefulness of the information we collect.

Costs to the students. When Kinsey conducted research in sexuality, he took
great care to ensure that the anonymity of the information he collected was not
breached. He interviewed people individually and anonymously, coded the answers
directly with a code that few people knew, and kept the coded data in a safe that
only a few could open. However, sexuality education programs have difficulty
maintaining such rigid controls. When students complete questionnaires in
classrooms, when questionnaires are collected in class and/or sent through the
mails, and when key punchers keypunch the data, some of these safeguards are lost.
Even if you take all reasonable precautions, no set of safeguards is foolproof.

If only a few people conduct research, then the chances of confidential
information being released remain small. When many researchers ask thousands of
students to complete sensitive questions, then the chances substantially increase.
Therefore, when making a decision about whether to ask a sensitive question,
seriously consider the cost of some student answering that question honestly, some
other students seeing the answer, and the reputation of that person being affected.

Costs to the program and the community. Any question about sexuality may
offend some people, but questions about individual sexual behavior are more likely
to evoke a negative reaction from parents or concerned community groups. If you
obtain appropriate approval from school boards or other boards, if you notify
parents and obtain their approval, and if you follow the other steps described in
Chapter 11 on administering the questionnaires, then the chances of a negative
reaction from parents or the community are greatly reduced. Nevertheless, you
should still consider the possibility of negative reaction and 1) be able to justify
the need for each question and 2) be sure of strong administrative support based on
careful review of the instruments.

Quality of the data. The validity of questions on sensitive sexual behaviors
may be lower than the validity of other questions. This, of course, will depend
greatly upon the age and experience of the students you're evaluating. When you
decide which behaviors to measure, you should consider the quality of the data and
the actual benefit of that data to you if it is not valid data.

Aspects of Behavior That Can Be Measured 

There are at least three different aspects of any behavior:

• The number of times the student engages in the behavior

• The skill with which the student engages in the behavior
• The comfort level the student feels during the behavior.

Often more than one of these may be important and should be measured.






Confidentiality and Validity of Behavior Questions

A major problem with questions about sexual behavior is that students may not
be willing to answer them honestly, and thus their answers may be invalid. However, you can
reduce this source of error by following the suggestions below in your questionnaire
design and by following the guidelines in Chapter 11 when administering the
questionnaire.

Print questions on only one side of the page. This reduces the ease with which
other people can see previously answered questions. If you print questions on both
sides of the page and if the pages are stapled together, then other people can more
easily see previously completed pages facing up.

Bury sensitive questions among other questions. If sensitive questions are
grouped separately, then others can more easily see them and may be more tempted to
look at someone else's questionnaire. In particular, sensitive questions should not
be separated at the bottom of the last page.

Prevent completion time from predicting sexual activity. If a questionnaire
has many questions that must be completed only by those who are sexually active,
then students may believe that students who finish the questionnaire first are not
sexually active and that those who finish last are sexually active. You should
prevent this by 1) having all students read and answer all questions or 2) including
many nonsensitive questions in the questionnaire.

Use the random response technique. The random response technique is one method
researchers have used to get a better estimate of sensitive behaviors while
absolutely assuring anonymity. As an example, suppose you have two identical
glasses filled with water to the same level. You know that the temperature of one
glass of water is 40 degrees and you wish to know the temperature of the other
glass. Rather than measure it directly with a thermometer, you could add the
contents of one glass to the other and measure the temperature of the combined
contents. If that water is 50 degrees, you could conclude that the water in the
unmeasured glass had been approximately 60 degrees.

To use the random response technique to evaluate a sexuality education program,
you should give all respondents two questions, a sensitive question and an easy
question, that have the same possible responses (yes or no, a Likert scale, etc.).
Each respondent then uses a random method (perhaps the flip of a coin) to determine
which question to answer. Thus, no one except the respondent can know how any
particular respondent answered any sensitive question, because no one knows whether
that respondent even answered the sensitive question. For example, you could ask
the students to answer one of the following questions yes or no, depending on their
toss of the coin:

Heads: Question A: Were you born between January and June?
Tails: Question B: Have you ever had sexual intercourse?

No one knows whether the student who answers yes is talking about a birthdate or a
sexual experience. However, the alternative questions were designed so that the
frequencies of answers to both the sensitive and the alternative questions could be
determined. The researcher, knowing the probability of students getting heads or
tails, and the probability of students being born during the first 6 months, can
statistically determine the frequency of sexual intercourse in the group.

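In the example above,

\[
\underbrace{(\%\ \text{answering ``yes''})}_{(1)} =
\underbrace{(\%\ \text{answering question A})}_{(2)} \times
\underbrace{(\%\ \text{born between Jan and June})}_{(3)} +
\underbrace{(\%\ \text{answering question B})}_{(4)} \times
\underbrace{(\%\ \text{having had sex})}_{(5)}
\]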

The answer to term 1 comes from the data. If a coin is flipped to determine whether
Question A or Question B was answered, terms 2 and 4 are both .5. If half of all
people in your class were born in January through June, then term 3 is .5. With
simple algebra, you can find the answer to term 5, which is what you really want to
know.
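
The algebra for term 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, its parameter names, and the 40 percent "yes" figure are hypothetical and do not come from the evaluation described here. The defaults correspond to a fair coin and to half the class being born in January through June.

```python
def estimate_sensitive_rate(p_yes, p_question_a=0.5, p_yes_to_a=0.5):
    """Solve (1) = (2)(3) + (4)(5) for term 5.

    p_yes         term 1: share of all respondents answering "yes"
    p_question_a  term 2: chance of answering Question A (heads)
    p_yes_to_a    term 3: share who would answer "yes" to Question A
    """
    p_question_b = 1.0 - p_question_a  # term 4: chance of answering Question B
    if p_question_b <= 0:
        raise ValueError("some respondents must answer the sensitive question")
    estimate = (p_yes - p_question_a * p_yes_to_a) / p_question_b
    # Sampling error can push the raw estimate below 0 or above 1, so clamp it.
    return min(1.0, max(0.0, estimate))


# Hypothetical class in which 40 percent of all answers were "yes".
rate = estimate_sensitive_rate(0.40)
print(f"Estimated share answering yes to Question B: {rate:.2f}")
```

Because the estimate rests on the coin and the birthdates behaving as expected, it is only as good as the sample is large, which is why the technique calls for rather large samples.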

This method requires 1) considerable additional work and 2) rather large sample
sizes. However, if you have reason to suspect that respondents are answering
dishonestly because of fear of exposure, you may wish to use this method.

Using Other Techniques for Enhancing Validity

Make vocabulary clear and unambiguous. In the area of sexuality, many words
are poorly defined and should be either clearly defined or avoided.

Poor:   When did you become sexually involved?
Better: When did you first have sexual intercourse?

Use appropriate response categories. If the behavior you're measuring is
discrete, then ask for a frequency of that behavior during a specified time period:

How many times did you have sexual intercourse during the last month?

If the behavior is not discrete, then provide appropriate response categories:

When you talk about sexuality with your girl/boyfriend, how often do you listen
to her/his feelings?

    _ almost never
    _ sometimes
    _ half the time
    _ usually
    _ almost always

Include "Does Not Apply" as a response category. Many behavior questions may
not apply to all respondents. To avoid confusing the respondents, be sure to
include "Does Not Apply" as a possible response in all appropriate questions.

Make items unidimensional. Too frequently researchers include questions that
ask about two separate phenomena in one question:

Poor:   How often do you have discussions about sexuality with your
        parents and friends?

Better: How often do you have discussions about sexuality with your
        parents?

        How often do you have discussions about sexuality with your
        friends?






Write questions that respondents can answer accurately. Research in many fields
strongly indicates that people poorly remember their own past behavior. The more
frequent the behavior, the poorer our memory of the frequency. For example, we may
easily remember the one time we have gone to Europe, but we may not be able to
remember how many movies we went to in the previous year. In general, questions
should deal with recent information:

Poor:   How many times did you have sexual intercourse during the last
        year?

Better: How many times did you have sexual intercourse during the last
        month?

Avoid or modify leading questions. When measuring sexual behavior, you may
want to ask many questions with socially desirable answers. You should either
rewrite the question so that its social desirability is not so obvious or eliminate
the question altogether because it would be invalid.

Poor: Do you ever discuss anything about sex with your parents? 
Better: Last month did you discuss sexuality with your parents? 

Poor:   Last month how many times did you have sexual intercourse
        without using any form of birth control?

Better: Last month how many times did you have sexual intercourse?

        How many times did you use some form of birth control?

Allow students to skip irrelevant questions. Many sexual behaviors are
hierarchical: a student who has never kissed someone has probably never engaged in
petting; an individual who has never engaged in petting has probably never engaged
in intercourse. This hierarchical principle can be used to have respondents skip
questions which are inappropriate and may embarrass them.

Question 35: Have you ever kissed a girl/boy? 

If no, skip to Question 40. 

Question 36: Have you ever had sexual intercourse?



Question 40: How often do you go to the movies?

If you use skip patterns, you risk students' concluding that some students are
sexually active (because they completed all the questions and took longer) and
others are not (because they skipped many questions and finished quickly).
Therefore, use this technique sparingly.

Create separate versions of the questionnaire for males and females.
Administering only one version of a questionnaire is often easier than administering
two versions. Nevertheless, if you want to ask males and females different
questions, or if you want to avoid the awkwardness of terms like "boy/girlfriend,"
then you may want to consider having male and female versions of the questionnaire.
Different versions may also increase confidentiality.






Measuring Skills and Effectiveness



It is often very difficult to measure either skills or the effectiveness of
using those skills. In particular, it is difficult to measure communication,
decisionmaking, or other interpersonal skills with questionnaires. A number of
people have tried and have not been fully successful.

One partially successful approach is to ask questions about frequency: how
often has the respondent actually used the various important components of
communication, decisionmaking, or other interpersonal skills? For example, to tap
decisionmaking behavior, you could ask questions about the frequency with which
students consider alternatives, obtain additional information, weigh the outcomes,
and take responsibility for the outcomes. This was the approach that we used in our
evaluation. A scale based upon this approach is included in the appendix.

Another approach is to focus upon the last event of a particular type and ask
numerous questions about that event. For example, if you are trying to measure the
effectiveness of using a particular birth control method, you could ask several
questions about how respondents used that method on the last occasion. If
sufficient time has passed, you can even ask whether the woman became pregnant.

If you are giving questionnaires to a small number of students and can
carefully score answers to questions, then you might consider writing scenarios,
asking the students to describe the factors they would consider in making a
decision, then scoring the answers. Such questionnaires must be carefully pretested,
and the judges who are doing the scoring must do so blindly; that is, they should
not know which questionnaires are pretests and which posttests or which belong to
the experimental or control groups.



Pretesting the Questionnaire



When you pretest the questionnaire with a small group of students, you should 
focus your questions on their perceptions of the sensitivity of the questions: 

• Were they comfortable answering the questions? 

• Do they think other students would answer the questions honestly? 

• Could the questions be reworded so that they would be less sensitive, yet 
still measure the same behaviors? 

• Which questions did they feel were especially likely to elicit incorrect,
socially desirable answers?

• How could those questions be reworded? 



Assessing Reliability and Validity

The reliability and validity of behavior questions may be reduced by several
factors, namely, the respondents':

• fear of being exposed

• reluctance to admit even to themselves that they had engaged in some
behaviors

• feelings of guilt about some behavior

• reluctance to remember or focus upon past and painful experiences

• desire to boast and enlarge upon their sexual activities.






Thus, assessing reliability and validity of behavior questions is particularly
important.

Assessing Reliability 

The best method of assessing reliability is the test-retest method. You should
administer a questionnaire to the students on two different occasions about 2 weeks
apart. Some kinds of sexual behavior are sporadic and change daily or weekly. If
students are having sexual intercourse, for example, that behavior probably varies
considerably from week to week. To measure the test-retest reliability of questions
about sexual intercourse, you should probably administer the questions only a couple
of days apart.
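
One common way to summarize test-retest agreement, offered here as an assumption since the text above does not name a statistic, is the correlation between each student's answers on the two occasions. The sketch below computes a Pearson correlation by hand; the function name and the six pairs of answers are hypothetical.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between paired answers from two administrations."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long lists with at least 2 answers")
    n = len(first)
    mean_1 = sum(first) / n
    mean_2 = sum(second) / n
    # Sum of cross-products and sums of squares around each mean.
    cross = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    ss_1 = sum((a - mean_1) ** 2 for a in first)
    ss_2 = sum((b - mean_2) ** 2 for b in second)
    if ss_1 == 0 or ss_2 == 0:
        raise ValueError("answers must vary within each administration")
    return cross / sqrt(ss_1 * ss_2)


# Hypothetical answers from the same six students on two occasions.
first_scores = [0, 2, 1, 4, 3, 0]
retest_scores = [0, 2, 2, 4, 3, 1]
print(round(pearson_r(first_scores, retest_scores), 2))
```

A value near 1 suggests that students answered consistently on both occasions; a low value may reflect either an unreliable question or real change in behavior between the two administrations.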

Other methods of assessing reliability are not likely to be effective, because
you cannot ask several slightly different questions about the same behavior and have
the respondents answer them independently. For example, if you ask several slightly
different questions, all of which seek to measure the amount of sexual activity the
previous week, the respondent will probably recognize their similarity and answer
them all the same. Thus, the answers will not be independent measures, and you
cannot use split-half or multi-item methods of reliability.

However, you may be able to examine the internal consistency of different
questions. For example, you might include the following three questions:

Have you ever had sexual intercourse? 

How many times did you have sexual intercourse during the last month? 

If you occasionally have sex, how comfortable are you getting some form of
birth control? (Include "Does Not Apply" as a response.)

Numerous combinations of possible answers would not be appropriate. For example, if
the respondent said that he had never had sex, but had sex four times last month,
and that the last question did not apply, then one or more of his answers must be
invalid.
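
A consistency check of this kind is easy to automate once the answers are coded. The following is a minimal sketch; the function name, the answer codes, and the two rules are illustrative assumptions, and a real questionnaire would need rules written for its own questions.

```python
def find_inconsistencies(ever_had_sex, times_last_month, comfort_answer):
    """Return the reasons why one respondent's three answers conflict.

    ever_had_sex      "yes" or "no"
    times_last_month  reported count for the last month
    comfort_answer    a comfort rating, or "does not apply"
    """
    problems = []
    if ever_had_sex == "no" and times_last_month > 0:
        problems.append("never had sex, yet reports sex last month")
    if times_last_month > 0 and comfort_answer == "does not apply":
        problems.append("reports sex last month, yet marks 'does not apply'")
    return problems


# The respondent described above: never had sex, four times last month,
# and "does not apply" on the last question.
for reason in find_inconsistencies("no", 4, "does not apply"):
    print(reason)
```

Counting how many questionnaires trip at least one rule gives a rough indication of how seriously the respondents treated the sensitive questions.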



Assessing Validity 

As indicated in Chapter 5, assessing the validity of behavior will probably be
difficult. Occasionally you can obtain evidence for different kinds of validity.

Face validity . If your questions are truly clear and straightforward, and if 
the respondents are willing to answer them honestly, the questions may have 
considerable face validity. However, as discussed in Chapter 5, face validity is 
the weakest kind of validity and is especially weak if the respondents may be 
reluctant to answer sensitive questions. 

Criterion validity. Criterion validity is the best form of validity, but you
cannot obtain it for many behavior questions. Occasionally you can obtain
criterion validity on contraceptive use by obtaining independent data from clinics
or other contraceptive sources in the area and comparing this data with the
questionnaire data. Sometimes you can also compare student data on communication
with parents with parent data on communication with their teenagers. However, if
the parents' data does not support the students' data, you don't know whether the
parents' data or the students' data or both are invalid.



Construct validity. Often you can hypothesize that different groups of people
will engage in different behaviors. For example, freshmen should engage in less
sexual behavior than seniors. This enables you to use construct validity in many
cases. Unfortunately, freshmen and seniors are very different in many ways, and thus
the mere fact that freshmen have reported less sexual activity than seniors does not
provide good evidence that you are measuring what you want to be
measuring.






CHAPTER 10 
SELECTING A SAMPLE



In research, a population is defined as the collection of people (or other
phenomena) that are of interest. The first step in evaluating sexuality education
or any program is to define the population of interest. If you want to generalize
to all teenagers, then the population is teenagers. If you are solely concerned
with those teenagers in a particular geographical area or in a particular school,
those teenagers form the population of interest.

In many cases, the population is too large to study in its entirety, and a
portion of the population, called a sample, is studied in order to make inferences
or generalizations to the total population. Therefore, the second step is to decide
whether to evaluate the entire population or to select a sample. If the population
is small (for example, if a school program has been given to only 100 students),
questionnaires can be given to the whole population. If thousands of students have
taken a program, querying the whole population may be too costly or time consuming,
and you can save time and effort by carefully selecting a sample of students to
participate in the evaluation.

Two factors determine the overall quality of the sample: its size and its
randomness. Both are important. If the sample is perfectly random but very small,
you cannot generalize to a larger population. For example, if 1,000 students
participated in a sexuality education program and if you interviewed or gave
questionnaires to only 10 of the students, you could not meaningfully generalize to
all 1,000 students because the 10 students might have special qualities that make
them different.

Similarly, if the sample is very large but not random, the results may not
represent the total population. For example, researchers in a major study of
sexuality once collected more than 100,000 questionnaires, but their sample was not
chosen randomly, so that in spite of their enormous sample size, their results
cannot be used to make meaningful inferences to any larger population. A random
sample of only 500 respondents would have been more useful.



Selecting a Sample Size 

Other things being equal, large samples are better than small samples for two
reasons: they decrease the amount of error caused by sampling and they increase the
power of the test. Both of these are discussed below. However, they also cost more
and may be more difficult to obtain. Thus, when selecting a sample size, you need
to consider the amount of acceptable sampling error, the desired power, the
feasibility, and the economic and social costs of samples of different sizes.






Sampling Error and Power 



Whenever you take a sample, you introduce a certain amount of sampling error.
For example, if you toss a fair coin 10 times, you will not always get exactly 5
heads and 5 tails; often you will get 6 heads and 4 tails, 4 heads and 6 tails, or
some other combination. Similarly, if you have 1,000 students in a high school and
randomly select 50 for a sexuality education class and another 50 for a control
group, the two groups will probably not be identical even before the course begins;
one of the groups is likely to be slightly brighter, more sexually active, or
different in some other way. The difference between the two groups is caused by
sampling error.

If you select 50 students for both the control group and the experimental
group, the difference between the two groups will probably be less than if you
selected only 2 students for each group. This illustrates the general principle
that larger samples tend to have less sampling error.
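
The principle can be seen in a small simulation. The sketch below is illustrative only: the school of 1,000 students, the score distribution (mean 80, standard deviation 5), and the function name are all assumptions. No program is applied to either group, so any gap between the two group means is sampling error alone.

```python
import random


def mean_group_gap(group_size, trials=2000, seed=0):
    """Average absolute gap between the mean scores of two random groups."""
    rng = random.Random(seed)
    # Hypothetical school: 1,000 students with scores spread around 80.
    school = [rng.gauss(80, 5) for _ in range(1000)]
    total_gap = 0.0
    for _ in range(trials):
        # Draw both groups at once so that no student is in both.
        chosen = rng.sample(school, 2 * group_size)
        experimental = chosen[:group_size]
        control = chosen[group_size:]
        gap = sum(experimental) / group_size - sum(control) / group_size
        total_gap += abs(gap)
    return total_gap / trials


print("groups of 2: ", round(mean_group_gap(2), 2))
print("groups of 50:", round(mean_group_gap(50), 2))
```

With groups of 2 the typical gap is several points; with groups of 50 it shrinks to a fraction of that, even though the groups were drawn in exactly the same way.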

Reducing sampling error or its possibility will reduce the probability that you
will make either of two errors. By increasing your sample size, you become less
likely to erroneously conclude that the difference between an experimental and
control group (or between the pretests and posttests) is due to the program when in
fact it is due to sampling error. Assume, for example, that the mean number of
correct answers on a knowledge test administered after a program is 85 for the
experimental group and only 80 for the control group. If there are only 2 students
in each of the groups, this difference might have been caused entirely by the
differences in the students before the program began. Concluding that the program
had caused the difference would be wrong. In contrast, if you had randomly assigned
50 students to each group, this difference is less likely to have been caused by
sampling alone, and you are less likely to reach an incorrect conclusion. Thus,
with a larger sample, you can be more confident that a difference is actually caused
by the program and not by sampling error.

Conversely, increasing the sample size decreases the probability that you will
incorrectly decide that a difference between the experimental group and the control
group is not significant when in fact it is. In the example above with two students
in each group, you might have incorrectly decided that the difference in mean test
scores was entirely due to sampling error when in fact it was caused by the program.
However, if you had had 50 students in each group, you would probably have
accurately concluded that the program was effective. Decreasing the probability of
this type of error is called increasing the power of the test.

Statistical principles demonstrate that you can be 95 percent certain that
sampling error will be less than or equal to (1.96)(standard deviation)/(square
root of the sample size), where N is the sample size:

\[
\text{Error} \le \frac{1.96 \times \text{standard deviation}}{\sqrt{N}}
\]

For example, if the standard deviation is 5 and your sample size is 100, your sample
estimate of the mean will be within (1.96)(5)/10, or .98, of the true population
mean.
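
The same arithmetic can be written as a short function, shown here as a sketch; the function name is hypothetical, and the formula is the one given above.

```python
from math import sqrt


def sampling_error_95(standard_deviation, sample_size):
    """Largest likely sampling error of a sample mean, at 95 percent certainty."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return 1.96 * standard_deviation / sqrt(sample_size)


print(round(sampling_error_95(5, 100), 2))  # the example from the text
print(round(sampling_error_95(5, 400), 2))  # same spread, four times the sample
```

Trying a few sample sizes with a guessed standard deviation is a quick way to see how much precision each additional block of respondents buys.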

Standard deviations are measures of the extent to which the scores are spread
out. They are more fully discussed in Chapter 14. However, even when you
understand standard deviations, you cannot know what the standard deviation of the
data will be until you have collected the data, and of course it is too late to
determine the sample size at that time. Thus, you have to make an intelligent
guess. You can improve your estimate by observing the standard deviations in
previously conducted studies or by asking consultants. One helpful hint: if you
have a dichotomous variable that is scored 0 and 1, the maximum possible standard
deviation is .5.

Normally, you would choose an acceptable error and then calculate the needed
sample size. Using simple algebra, the formula above becomes:

\[
\text{sample size} = \left( \frac{1.96 \times \text{standard deviation}}{\text{acceptable error}} \right)^{2}
\]

That is, you should:

1. Multiply 1.96 times the estimated standard deviation.

2. Divide this product by the acceptable error.

3. Square this quotient.

If you have a dichotomous variable and your estimated standard deviation is .5, and
if your acceptable error is .1, then your sample size should be:

\[
\left( \frac{(1.96)(.5)}{.1} \right)^{2} \approx 96
\]

As you can see from this formula, larger sample sizes produce less error than
smaller sample sizes. However, increasing sample size carries diminishing returns.
The error is inversely proportional to the square root of the sample size. For
example, if you increase the sample size by a factor of 4 (e.g., from 100
respondents to 400), you will reduce the error in sample statistics by a factor of
only two.

Although the formula above gives you the proper sample size for a specified
amount of error, many evaluators simply use general guidelines for sample size.
These are presented below.




Sample Size    Comments

25             This size is about the smallest size that warrants doing
               statistical research. Effects of the program would have
               to be rather large to obtain statistically significant
               results.

100            This size is substantially better than 25 and is commonly
               worth the additional effort to collect and analyze the data.

200            This size is about the largest that warrants additional
               effort unless the evaluation is a major project being
               completed with great care.

1000           This size is necessary for national studies of major
               significance and requires substantial funding.
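
The sample-size formula and the guideline sizes can be compared directly. The sketch below is illustrative: the function names are hypothetical, and the standard deviation of .5 is the worst case for a dichotomous variable scored 0 and 1. The first function returns the unrounded value; rounding up to the next whole respondent is the cautious choice.

```python
from math import sqrt


def required_sample_size(standard_deviation, acceptable_error, z=1.96):
    """Unrounded sample size needed to keep sampling error within the target."""
    if acceptable_error <= 0:
        raise ValueError("acceptable_error must be positive")
    return (z * standard_deviation / acceptable_error) ** 2


def sampling_error(standard_deviation, sample_size, z=1.96):
    """Sampling error implied by a given sample size (the formula run in reverse)."""
    return z * standard_deviation / sqrt(sample_size)


# The worked example: standard deviation .5, acceptable error .1.
print(round(required_sample_size(0.5, 0.1)))

# Error implied by each guideline size for a 0/1 variable.
for size in (25, 100, 200, 1000):
    print(size, round(sampling_error(0.5, size), 3))
```

Listing the implied error beside each guideline size makes the diminishing returns concrete: each step up in sample size removes less error than the step before it.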





Feasibility



Sometimes it is not possible to select a sufficiently large sample: the number
of people in the program may be small and thereby limit the sample size; you cannot
obtain the names or addresses of previous participants; parents, school boards, or
other bodies will not provide their consent; participants will not agree to
participate; or other very practical matters will limit the sample. These problems
are especially likely to arise when you wish to administer questionnaires to a
control group that has not participated in your program.

You need to consider all these potentially limiting factors in advance and
surmount them as best you can. Those factors that cannot be surmounted should
be described in your final report.



Social and Economic Costs

If the number of program participants and other similar factors do not limit
the sample size, then you need to consider the social and financial costs of
increasing the sample size. Although the gains from increasing the sample size
diminish with increasing sample size, many of the costs increase proportionately.
Doubling the sample size may well double the cost of copying the questionnaires and
the person hours required to complete, code, and keypunch the questionnaires. It
may also double the risk of students seeing the confidential answers of other
students. The person and computer time required to analyze the data will probably
also increase, though not proportionately.

In sum, your selection of a sample size should reflect all these factors.
Although the optimal sample size will vary from one study to another, the 
guidelines provided above may be helpful. 



Improving the Randomness of a Sample

A sample is considered random if some method that is completely unrelated to
any characteristic of the population is used to select the sample. For example, if
you listed all the names of the students in the population of interest, selected
each name one at a time, and flipped a coin to determine whether to include that
person in the sample, you would create a random sample. If the sample is large
enough, the students in the sample should have characteristics very similar to the
entire population. However, if your population of interest is everyone in a school,
and you selected a sample by specifying everyone in study hall, the sample may not
be representative, because it would exclude all students who do not come to study
hall: students, for example, who are on work programs and students who are
preparing for college and do not have time to take study hall. Such students might
be affected by a sexuality education program differently than others. Thus, the
study hall sample could bias any conclusions drawn about the impact of the program
upon the entire school.

There are several methods of randomly selecting students. One of the best is
to determine the desired sample size, assign everyone in the population a number,
and then, using a table of random numbers in a statistics book, select students one
at a time until you have the desired sample size. Another good way is to determine
the desired sample size, divide the population by that number to determine "p",
arrange all names in alphabetical order, and then assign every pth person on the
list to the sample. For example, if the population has 2,000 people and you want a
sample size of 200 people, you would select every 10th person.
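The two selection methods described above can be sketched in a few lines of Python. This is only an illustration, not a procedure taken from the handbook: the function names and the made-up student labels are hypothetical, a software random number generator stands in for the printed table of random numbers, and the random starting point in the second function is an added assumption (the text simply says to take every pth person).

```python
import random


def simple_random_sample(names, n, seed=None):
    """Draw n names at random, without replacement (stands in for a random number table)."""
    rng = random.Random(seed)
    return rng.sample(names, n)


def systematic_sample(names, n, seed=None):
    """Alphabetize the names and take every pth one, where p = population size / n."""
    ordered = sorted(names)
    p = len(ordered) // n          # e.g., 2,000 / 200 = 10
    rng = random.Random(seed)
    start = rng.randrange(p)       # assumption: begin at a random position within the first p names
    return ordered[start::p][:n]


# Hypothetical population of 2,000 students, as in the example in the text.
population = [f"student_{i:04d}" for i in range(2000)]
srs = simple_random_sample(population, 200, seed=1)
sys_sample = systematic_sample(population, 200, seed=1)
```

Both calls return 200 students. The systematic version is easier to carry out by hand from an alphabetical roster, which is presumably why the text suggests it as an alternative.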

Randomly selecting individuals is often not feasible because it requires
calling students out of class. Another method is to administer questionnaires to
all the students in part of a random sample of classes in the school. The risk here
is the same as in the study hall example above: you must be sure the sample of
classes is representative of the total population. The sample should include
proportionate numbers of students according to intelligence, racial background,
grade level, popularity among peers, etc.

Improving the Response Rate

When researchers try to collect information from a specified sample, they
normally are unable to collect the desired information from all the members of that
sample. For example, if they mail a questionnaire to a sample of people, the
addresses of some of the envelopes may be incorrect. Some of the people may fail to
complete and return the questionnaires. Others may not treat the questionnaire
seriously, and answer questions in a flippant manner so that the questionnaire must
be discarded. The percentage of members of the originally specified sample that
provides usable information is defined as the response rate.
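The definition above amounts to a simple percentage. The following Python sketch shows the arithmetic; the function name and the figures used (200 questionnaires mailed, 130 returned, 10 discarded as flippant) are invented for illustration and do not come from the handbook.

```python
def response_rate(sample_size, usable_responses):
    """Percentage of the originally specified sample that provided usable information."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= usable_responses <= sample_size:
        raise ValueError("usable_responses must be between 0 and sample_size")
    return 100.0 * usable_responses / sample_size


# Hypothetical figures: only returned questionnaires that are not discarded count as usable.
mailed = 200
returned = 130
discarded = 10
rate = response_rate(mailed, returned - discarded)
print(f"Response rate: {rate:.0f}%")
```

Note that the denominator is the originally specified sample, not the number of questionnaires that came back.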

You should try to obtain as high a response rate as possible, because a high
response rate will help you obtain a sample size closer to your desired sample size.
Even more important, a high response rate will help maintain the randomness of your
sample. If your response rate is unexpectedly low, you may not have enough people
in your sample, and even more important, the people who do not respond may differ
significantly from those who respond and thereby bias your analysis. For
example, people who fooled around during the sexuality education course may have
learned less and may be less likely to return mailed questionnaires. Thus, if your
response rate is low, you might incorrectly conclude that your students learned more
than they actually learned. Alternatively, students who were initially less
knowledgeable about sexuality may have learned the most, and these students might be
less likely to return the questionnaire. In this case, if you got a low response
rate, you might incorrectly conclude that your students learned less than they
actually did. The important point is that your sample should be a random selection
of the people in your population, and if many people fail to return questionnaires,
you may not know how this will affect your analysis.

Response rates of 80 or 90 percent are considered very good in social science
research. When questionnaires are mailed to people, the response rates are more
commonly around 50 or 60 percent. Response rates lower than that are
generally unacceptable.

When you administer questionnaires to a captive group, you should get usable
information from almost all of the group. However, if you send questionnaires home
with your students or through the mail to parents or other members of the community,
there are several ways to increase a response rate that might otherwise be low:

• Telephone the respondents or send them a notice in advance indicating that
they will be receiving a questionnaire in the mail and that it is important
for them to complete and return it.

• Offer to share with them or publish the results of the survey.

• Make all correspondence and the questionnaire very professional.

• Ask a principal, school board, or some other respectable person or group to
endorse the study.

• Send a followup questionnaire or postcard or both to those who do not
respond. (If the questionnaire is anonymous, you must provide a separate
return postcard indicating they have completed and mailed the
questionnaire.)

• Telephone those who do not respond.

When your response rates are low, you should determine as best as you can how
your respondents differed from the nonrespondents. This is difficult because by
definition you do not get the questionnaires from the nonrespondents. However,
there are still two approaches you can follow. First, compare the characteristics
of your completed sample with characteristics of the population. For example, if
most of your completed sample is of one race, but most of the population is of a
different race, then there is a bias; or if most of the sample consists of seniors,
but most of the students in the course are sophomores and juniors, then there is a
bias. Second, make a greater effort to obtain information from some of the
nonrespondents and then see if they differ from the respondents. For example,
carefully obtain information from about 10 to 20 nonresponding students, and see if
they were more likely to have dropped out of school, to have become pregnant, or to
have done something else which would bias your data. If they simply moved away
because their parents changed employment, then this might not be a significant or
important bias.

When people evaluate sexuality education programs, they typically select first
a program, then a sample of participants, and then evaluate the impact of the
program upon those participants. If the program is successful, the evaluator
probably writes and successfully publishes an article. Others then read the article
and conclude that sexuality education is successful. If the program is not
successful, the evaluator probably does not write and publish an article to inform
others about the program's lack of success. Or, if the evaluator does write the
article, journals may be reluctant to publish it. Such inconsistency produces a
bias in the literature: only successes are published and read. This bias can be
reduced by randomly choosing a sample of programs and then publishing the results of
those programs regardless of whether they are found to be successful. Of course,
this requires the cooperation of a journal to publish negative results or
non-findings.

References

Kish, L. Survey Sampling. New York: Wiley, 1965.



CHAPTER 11



Few methodology texts provide guides to administering questionnaires, but our
experience has clearly demonstrated that poor administration can completely
invalidate good questionnaires and destroy the evaluation. Thus, administering
questionnaires properly is just as important as constructing them well.

Obtaining Approval

Because sexuality is a sensitive and controversial topic, it is often important
to obtain approval to administer the questionnaires from parents, school officials,
human subject review boards, and/or other appropriate organizations. There are
several reasons to obtain this approval. First, parents should have the right to
prevent their children from reading and answering sensitive questions about
sexuality. Similarly, school officials and teachers should also have the right to
prevent children in their school or classrooms from completing questionnaires on
sensitive subjects. Several people who have evaluated sexuality education programs
in this country have learned this principle the hard way. They failed to obtain
approval from parents and school personnel, and when their work was discovered, the
evaluations were scuttled. Obtaining approval in advance would probably have
prevented such a drastic consequence.

Second, obtaining approval prior to administering questionnaires can help
prevent potential abuse of sensitive data. Very rarely, if ever, do researchers
abuse the collection of sensitive data, but obtaining approval may add safeguards
the researchers overlooked.



Finally, approval should be obtained in some places because it is required by
law or governmental regulations. In California, for example, parental consent must
be obtained before sensitive questionnaires are given to students. Similarly,
federal regulations require various kinds of approval if federal funds are used
in the evaluation.

In our experience administering thousands of questions to students, 99% of
parents gave permission, less than 1% denied their permission, and none complained
at any later time. To obtain permission from parents, write them a letter including
the following:

• The rationale or need for the information and the study.

• An accurate summary of the questionnaires.

• A statement of approval received from school boards or other official
groups.

• A summary of special procedures that will be followed to ensure
voluntariness and anonymity.

• A statement of appreciation for parental support.



Experience has indicated that it is not necessary to send the entire
questionnaire home to parents. Doing so may needlessly raise questions. On the
other hand, if any parents wish to see the questionnaire, they should certainly be
given that opportunity.

At the same time, it is essential to provide an accurate summary of the
questionnaires. If the questionnaires include questions about sexuality or 
behavior, the letter to the parents should state this. If the letter to the parents 
does not accurately describe the questionnaires, parents will have the right to
complain. If the letter accurately describes the questionnaires, parents will have 
no reason to complain and will probably not do so. Some evaluators have made the 
description more concrete by including an example or two. In sum, if the letter is 
well written and its information is accurate and valid, then most parents will 
provide their consent. 

Selecting a Test Administrator 

Methodologists differ over the question of whether to have a teacher or some
other person administer the questionnaires. If teachers administer and see the
test, they may either consciously or unconsciously "teach to the test." For
example, they might cover some of the facts asked on a knowledge test immediately
prior to the test and thereby bias the results of the test. In addition, the
teacher's very presence may bias the students, particularly when the students are
being asked to rate the teacher or the class. If the study is evaluating several
different classes with different teachers, having a single skilled test
administrator give all the tests will ensure that the administration is the same in
all the classes and avoid the risk that one or more teachers will fail to properly
follow the directions, assure anonymity, answer questions about the test
appropriately, etc.

On the other hand, there are also reasons for having the teacher administer the
tests. First, just as students often treat a regular teacher with more respect than
a substitute, some students will treat a questionnaire far more seriously and answer
the questions more reliably if the teacher, instead of an unknown person,
administers the questionnaire. Second, some students will answer some sensitive
questions honestly only when the teacher has gained their trust and is
administering the questionnaire. Third, teachers who have considerable knowledge
about sexuality may be better able to answer questions about the test than a test
administrator who is not knowledgeable about the field. Finally, employing a test
administrator each time a questionnaire is being administered may simply be too
costly.

In sum, the researcher needs to consider the pros and cons and the individual
abilities of both teachers and others to administer the questionnaires. In some
situations, a compromise may be the optimal solution. Especially when trust is an
issue, the teacher can emphasize the need to complete the questionnaires carefully
and explain that responses will be anonymous, and an independent skilled test
administrator can then actually administer the questionnaires and ensure that the
correct procedures are followed.

Selecting Dates 

When employing an experimental or quasi-experimental design, the researcher
will typically want to administer the questionnaires at the beginning of the program
(pretests), at the end of the program, and possibly several months or even years
after the program (posttests).

Timing of pretests is important. Especially for questionnaires containing
sensitive questions about behavior, the researcher may want to delay administering
the pretest until considerable rapport has been established between the students and
the teacher. For example, if the program lasts an entire semester, the researcher
might administer the questionnaires at the end of the first week. Such delays are
feasible when the program lasts several weeks or longer and when relatively few
topics are covered during the first few days.

Similarly, posttests may best be given earlier than the close of the program.
For example, if the program ends near the end of the semester or the academic year,
then the first posttest should be given about a week before the end. During the
last days of a semester or school year, students are excited; they are ready to
leave school and have vacation; they have many other tests. Consequently, they are
less likely to answer questionnaires carefully and validly. Again, this
consideration applies only to long programs.



Ensuring Voluntariness While Encouraging Cooperation

Because of the personal and sensitive nature of some of the questions,
students' cooperation in completing questionnaires must be entirely voluntary.
Assure them, both in the written questionnaire directions and verbally during the
instructions, that they are not required to complete any questions that make them
uncomfortable. If the questionnaires are administered as part of a class, emphasize
that their decision to skip part or all of the questions will not affect their
grades.

On the other hand, the results of the study will be more valid if a large 
percentage of the selected participants do complete the questionnaires. Thus, you 
should encourage (but not pressure) students to participate and should stress the 
importance of their participation in the study. You can do this by emphasizing that
their answers may affect future programs. 



Ensuring Anonymity

The personal and sensitive nature of the questionnaires requires that they be
truly anonymous. No one (including you, the administrators) should know who
completed specific questionnaires. To this end, describe all of the following steps
before you give students the questionnaires. Discussing the steps before
distributing the questionnaires will not only help them follow the steps, but will
also assure them that anonymity is being treated very seriously.

• Physically separate the students so that no student can see the responses
of any other student. Separating students may require rearranging desks or
using a larger room. Neither teachers nor test administrators should walk
around the room, if doing so enables them to see the answers of the
students.

• Stress to the students that no one, neither the student nor anyone else,
should place any identifying information on any of the questionnaires.

• Ask students to use normal lead pencils to complete the questionnaires.
Red, green, or purple ink can destroy the anonymity of a questionnaire. Supply
pencils, if necessary.

• Give all students identical envelopes into which they will place their
questionnaires before turning them in. Once questionnaires have been
turned in, mix up the envelopes so that no one knows whose questionnaire is
on the top or bottom. Alternatively, let them put their questionnaires
anywhere in the middle of the pile of questionnaires or have them drop
their questionnaires into a large ballot box.



Using Identification Numbers

Whenever you need to match pretests with posttests, you must use some method
that enables you to pair each individual's pretest and posttest, yet maintains the
confidentiality of the questionnaires. The best and most common method is to assign
unique identification (ID) numbers to the students and to their respective
questionnaires. There are several different ways of doing this; each has its own
advantages and disadvantages.

In the first method, the researcher randomly assigns each student an ID, keeps
a list of students' names and their respective ID numbers, and then during both the
pretest and posttest, gives each student the questionnaire with his or her ID number
written on the questionnaire. This method is relatively simple and works well
methodologically. However, the researcher could take a given questionnaire, observe
the ID number on it, and then use the list of names and ID numbers to discover which
student answered those questions, thus destroying the assurance of anonymity.
Technically, this problem can be overcome by making sure that the people who see the
completed questionnaires never have access to the list of names and ID numbers.
However, some students may not fully trust this process; they may realize that if
the researchers wanted to, they could bring together the list and the questionnaires
and destroy their anonymity.

In the second method, students select their own numbers and then put those
numbers on each questionnaire that they complete. The administrator can instruct
students to use the month and day of their own or a parent's birthday (June 7 would
be 0607), the last four digits of their phone number, or the last four digits of
their social security number. There are at least three problems with using
student-selected numbers. First, the number may not be unique (e.g., two or more
students may have the same birth date). This problem can be minimized by further
dividing students into reasonably small groups (e.g., Mr. Jones' second period
health class), and by using other identifying information on the questionnaires
(e.g., the person's sex, age, or handwriting) when duplication of a number does
occur. The second problem with this method is that someone may recognize the
number. For example, a student may recognize the birth date or phone number of a
friend. However, this problem is relatively minor because other students or friends
should never see the completed questionnaires, and the researcher who does see them
will not know the birth dates, phone numbers, etc. of the students. Moreover, no
one will know whether an ID number is a birthday or some other number. The third
problem is that students may forget what number they selected (e.g., their own
birthday or their mother's birthday). You can overcome this by specifying yourself
what number they should select. Eliminating the student selection of the type of
number will, however, slightly increase the chances that someone else will see the
number and identify the respondent.
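The matching step for student-selected numbers might look like the following Python sketch. It is only an illustration under stated assumptions: the record layout (dictionaries with "group", "id", and "score" keys), the function name, and the sample data are all hypothetical. Pairing within small groups follows the suggestion above, and any number that appears more than once within a group is set aside for resolution by hand rather than matched automatically.

```python
from collections import Counter


def match_by_id(pretests, posttests):
    """Pair pretests and posttests that share a (group, id) key.

    Keys that occur more than once in either set are returned separately as
    ambiguous, so they can be resolved by hand (e.g., using sex or age).
    """
    def key(record):
        return (record["group"], record["id"])

    pre_counts = Counter(key(r) for r in pretests)
    post_counts = Counter(key(r) for r in posttests)

    # Keep only records whose key is unique within its own administration.
    pre = {key(r): r for r in pretests if pre_counts[key(r)] == 1}
    post = {key(r): r for r in posttests if post_counts[key(r)] == 1}

    matched = [(pre[k], post[k]) for k in sorted(pre.keys() & post.keys())]
    ambiguous = sorted(
        k for k in (pre_counts.keys() | post_counts.keys())
        if pre_counts[k] > 1 or post_counts[k] > 1
    )
    return matched, ambiguous


# Hypothetical data: two students in "period2" happened to choose the same number.
pretests = [
    {"group": "period2", "id": "0607", "score": 11},
    {"group": "period2", "id": "1123", "score": 9},
    {"group": "period2", "id": "1123", "score": 14},
    {"group": "period3", "id": "0607", "score": 8},
]
posttests = [
    {"group": "period2", "id": "0607", "score": 15},
    {"group": "period2", "id": "1123", "score": 12},
    {"group": "period3", "id": "0607", "score": 13},
]
matched, ambiguous = match_by_id(pretests, posttests)
```

Here the number 0607 is matched separately in each class period, while 1123 in period2 is flagged as ambiguous because two pretests carry it.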






Giving Directions and Answering Questions



All directions should be written on the questionnaire. However, since many
students fail to read directions, they should always be paraphrased verbally. If
the questionnaires are carefully designed and pretested, then all the directions
should be clear, and no students should fail to understand the questions. However,
invariably some students become confused about some questions. A student's
misreading the directions and answering all questions in an incorrect way will
decrease the validity of the data. In general, the test administrator should do
whatever will maximize the validity of the data. Of course, the administrator
should not answer any knowledge test questions. Moreover, whatever policy the
administrator establishes for answering appropriate questions should apply to all
administrations of the test. Providing help on the pretest, but not on the
posttest, may introduce bias in the analysis.

In addition, we have found that when the teacher stresses the importance of the
study and of careful answers, the students do treat the questions seriously; when
teachers fail to stress care, some students are careless.

Allowing Sufficient Time 

When measuring the effects of sexuality education programs, there is rarely a
need to administer timed tests. If students are hurried, or even if they believe
that they may be hurried, they may spend less time on each question and answer less
carefully. All students should have sufficient time to answer each question
carefully.

Attention span varies with different groups of students and with different
kinds of questions. When students complete tests for grades, they may be able to
concentrate an hour or more, but when they complete questionnaires that have little
impact upon them, their attention span is shorter. Attention span is probably
greatest for questions about their behavior that they find interesting, shorter for
questions about their attitudes, and shortest for questions testing knowledge. If
students cannot complete the questionnaires in 20 to 30 minutes, administer them
over 2 or more days, if possible.






CHAPTER 12



This handbook has frequently emphasized that an important principle in
methodology is that phenomena (whether outcomes or programs) should be measured
or evaluated in two or more ways that are maximally different.

Poor: Using two different questionnaires to measure the outcome of a program.
Better: Administering questionnaires and conducting in-depth interviews.

If the maximally different methods all provide evidence for the same 
conclusion, then you can have much greater faith in that conclusion. Rarely will 
results from different methods be identical. However, if one of the methods
provides evidence for one conclusion and the other method provides evidence for a 
different and conflicting conclusion, then one or both of the methods must be 
incorrect, and you cannot have much faith in your conclusion. 

Two maximally different kinds of methods are obtrusive and unobtrusive methods. 
Previous chapters have discussed the use of questionnaires to evaluate programs. 
Questionnaires are obtrusive because they intrude into the lives of the respondents, 
requiring their knowledge, their consent, and even their full cooperation. In 
contrast are unobtrusive methods that do not intrude into the lives of the
participants and do not even require their knowledge, consent, or cooperation. 

Developing unobtrusive methods often requires great creativity. To demonstrate
the wide range of creative possibilities, this chapter will (1) briefly describe
several examples of unobtrusive measures used in other fields, and (2) discuss
possible uses in the analysis of sexuality education programs.

Using Unobtrusive Measures in Other Fields

Problem: When television first became popular in this country, many people
wanted to ascertain the impact of television upon reading habits. To have
administered a questionnaire to a random sample of Americans would have been costly,
and previous studies indicated that such studies were invalid because many people
forget what they have read, and others exaggerate how much they have read.

Solution: The researchers sampled several libraries in the country and
observed the changes over time in the number of different kinds of books that were
checked out. They also observed the changes over time in book sales.

Problem: A museum wanted to measure the popularity of different exhibits but
didn't want to administer questionnaires or directly observe the number of people
viewing the exhibits.

Solution: The museum installed an inexpensive tile floor that wore out rather
quickly. Every 6 months they measured the thickness of the tiles. If the tiles
were thinner than average in front of a particular exhibit, they concluded that more
people had walked or stood in front of that exhibit. To assess the popularity of
different exhibits with different age groups, they also counted the number of finger
prints of varying heights on the glass in front of the exhibits.

Problem: A county in Kentucky abolished the sale or importation of alcoholic
beverages in the county and then wanted to know the actual impact upon drinking
habits of the population. Obviously people would not have answered questionnaires
honestly.

Solution: First, county officials simply observed the change over time in
citations for drunk driving. Second, both before and after the abolition of
alcoholic beverages, they counted the number of empty bottles of different kinds of
alcohol in a random sample of trash cans in the county.

Using Unobtrusive Methods to Measure Contraceptive,
Pregnancy, and STD Rates

Many programs want to 1) increase the effective use of birth control methods,
2) decrease the amount of unprotected sexual activity (and thereby decrease unwanted
pregnancies), and 3) reduce the number of cases of sexually transmitted disease.
These goals are clearly not the only goals of sexuality education programs.
However, many programs consider them very important and use them to justify their
funding. Thus, it is essential to adequately measure the impact of programs upon
pregnancy and STD rates.

Whereas previous chapters have discussed methods of measuring these rates with 
questionnaires, this chapter focuses upon unobtrusive methods of measuring these 
rates, namely, collecting data from clinics.

Collecting Data from School Clinics

In a few high schools, measuring the impact of a program upon contraceptive
use, pregnancies, births, and STD's is relatively easy, because most girls who want
some form of contraception, who become pregnant, or who get a sexually transmitted
disease go to the high school clinic for initial treatment and/or referral. Thus,
you can ask the health clinics in these schools to simply tally the number of
observed pregnancies, births, and cases of sexually transmitted diseases each year.

In many schools, a large proportion of girls who become pregnant go to term and 
are either visibly pregnant while at school and/or obtain a medical excuse from the 
school clinic to drop out of school before and after delivery. In these schools, 
the health clinics can tally the number of births. 

Because such clinics also have access to the names of the students who attend 
the sexuality education programs, they would be able to 1) determine the numbers of 
pregnancies, births, and cases of STD each year before and after the sexuality 
education program and 2) compare people who did and did not take sexuality 
education. That is, you can use a quasi-experimental design in such settings to
obtain important information unobtrusively.

However, monitoring the use of contraception, the number of pregnancies, and
the number of cases of STD is much more difficult than monitoring births. Many
girls use contraception, have early miscarriages or abortions, or have STD's
without the knowledge of any staff person in the school. Thus, you will normally have
difficulty obtaining valid data from school clinics.

Confidentiality is always a serious problem, because a teenager's use of
contraception, pregnancy, or case of STD should never be made public. To assure
that confidentiality is maintained, only appropriate people should be allowed to
view the clinic's records, and all research data with personal identifying
information should be kept absolutely confidential.

Collecting Data from Nonschool Health Clinics

In some communities the vast majority of teenagers who obtain a medical form of
contraception, who become pregnant, or who get an STD go to a small number of
doctors or clinics. You can determine if the students in a particular school visit
a limited number of doctors or clinics by administering an anonymous questionnaire
to seniors, asking them where they have gone or would go if the need should arise.
If most students would visit a limited number of doctors or clinics, and if all of
these doctors and clinics are willing to participate, then there are two different
ways to collect contraceptive, pregnancy, and STD data.

One method involves creating lists of all female students in your school each
year, then looking up each student's name in the files of each clinic or doctor to
determine whether that female student obtained contraception, got pregnant, or got
an STD that year. Often, when looking up a name, it is convenient to see whether
that student attended the clinic or doctor during any previous year. When checking
the records, write down the date of the visit, so that 1) additional pregnancies or
cases of STD by the same person can be recorded, and 2) redundant visits to two or
more doctors or clinics for the same problem will not be counted twice. Count the
numbers of people who obtained contraceptives, got pregnant, or had an STD, and add
across clinics and doctors. Then compare rates before the implementation of a
sexuality education program with rates after its implementation.
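The reason for recording visit dates can be illustrated with a short Python sketch. It is a hypothetical example, not the handbook's procedure: the tuple layout, the function name, and especially the 60-day window used to decide that two visits concern "the same problem" are assumptions chosen only for illustration, and student identities are shown as letters because real names must remain confidential.

```python
from datetime import date


def count_cases(visits, window_days=60):
    """Count distinct cases from (student, event, visit_date) records pooled across clinics.

    Assumption: a visit by the same student for the same kind of event within
    window_days of her previous such visit is a redundant visit for the same
    problem; a later visit counts as an additional case.
    """
    cases = 0
    last_seen = {}
    for student, event, when in sorted(visits, key=lambda v: v[2]):
        k = (student, event)
        previous = last_seen.get(k)
        if previous is None or (when - previous).days > window_days:
            cases += 1
        last_seen[k] = when
    return cases


# Hypothetical records gathered from two clinics.
visits = [
    ("A", "pregnancy", date(1983, 3, 1)),    # clinic 1
    ("A", "pregnancy", date(1983, 3, 15)),   # clinic 2, same pregnancy
    ("A", "pregnancy", date(1984, 6, 1)),    # a later, additional pregnancy
    ("B", "std", date(1983, 5, 2)),
]
total = count_cases(visits)
```

In this made-up example the four visits yield three cases: the second visit is treated as redundant, while the 1984 visit is counted as an additional pregnancy.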

If you wish to compare students who take sexuality education with other
students who have not taken sexuality education, divide your list of female students
accordingly. Then find the rates for each group.

If a doctor or clinic from whom you need data will not allow an outsider to
view their records, you may be able to hire a staff person currently working for
that doctor or clinic to review the records. This procedure may also help maintain
the confidentiality of the data.

A second method of collecting contraceptive, pregnancy, or STD data involves
asking the cooperating doctors and clinics to collect a very small amount of
additional information during the intake interview from all teenagers getting
contraception, having a positive pregnancy test, or having an STD. If the doctors
and clinics ask which school the teenager attends, they can provide you with the
numbers of teenagers from each school each month or year that obtained
contraception, had a positive pregnancy test, or had an STD. In addition, if the
doctors and clinics ask whether the teenager completed the sexuality education
course in that school, then they can provide data on the numbers of teenagers both
taking and not taking sexuality education who are seeking treatment or
contraceptives.






The first method discussed above has two major advantages. First, it can be
implemented months or even years after people have attended the clinic, provided the
records of the clinic patients remain on file. In contrast, the second method does
not allow the collection of data for years prior to the data collection, because it
depends upon patients answering questions when they come to the clinic. Second, the
first method is probably more reliable. If you can gain access to the records, and
if you have sufficient time to look up all the names, it should be a relatively
straightforward task producing reliable data. However, if you are evaluating
several schools, or if you are evaluating schools with very large numbers of
students, the first method may be too time consuming and costly.

Collecting Data from District or County Statistics

In a few schools you can use district or county statistics to evaluate your
program. If a school district implementing a sexuality education program is
coincident with a county or a health district, then you may be able to obtain
official estimates of contraceptive use, pregnancies, abortions, births, and cases
of STD from the county or health organizations. That is, other people will already
have done all the work for you. However, you can only use these data to compare the
statistics before the sexuality education program was implemented with the
statistics after the program was implemented, and rarely are sexuality education
programs implemented quickly in entire health districts.



Limitations on Pregnancy, Birth, and STD Data

Because pregnancy, birth, and STD rates vary substantially in schools from year
to year simply because of chance factors, obtaining several years of baseline data
and several years of post-program data is critical, particularly when the impact of
the sexuality education program is likely to be small. For example, if a course
successfully reduced the number of teenage pregnancies among the students in the
course by 30%, and if 30% of the student body completed the course, then the
pregnancy rate for the entire school would decline by only 9% (30% x 30%) because of
the program. The rather small impact of a rather successful program would probably
be obscured by the annual changes in pregnancy rates caused by chance or systematic
factors.
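
The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the calculation assumes that students who take the course and students who do not would otherwise have the same pregnancy rate.

```python
def school_wide_decline(course_effect, share_enrolled):
    """Proportional decline in the pregnancy rate for the whole school.

    course_effect  -- proportional reduction among students who took the course
    share_enrolled -- proportion of the student body that completed the course

    Assumption (not from the handbook): course takers and other students
    start with the same pregnancy rate.
    """
    return course_effect * share_enrolled


decline = school_wide_decline(0.30, 0.30)
print(f"School-wide decline: {decline:.0%}")
```

Trying other values for the two arguments shows how quickly the school-wide effect shrinks when only part of the student body takes the course.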

Any procedure for collecting pregnancy, birth, and STD data will invariably
underestimate the actual number of pregnancies, births, and cases of STD because
some pregnancies, births, and cases of STD will certainly be missed. However, this
is not a problem if 1) the researcher is comparing rates over time, and 2) the
percentage of missed pregnancies, births, or cases of STD remains constant over
time. For example, if a school actually has 100 pregnancies per year before a
program but only identifies 80% or 80 of them, and if the same school actually has
70 pregnancies after a program but only identifies 80% or 56 of them, then the
researcher would properly conclude that pregnancies have declined by 30%. In sum, a
systematic error in the collection of data will not affect the estimated percentage
change.
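
As a rough check on this reasoning, the following Python sketch recomputes the example. The variable names are illustrative only; the numbers are the ones used in the paragraph above.

```python
def percent_change(before, after):
    """Percentage change from the 'before' count to the 'after' count."""
    return (after - before) / before * 100


true_before, true_after = 100, 70   # actual pregnancies per year
detection_rate = 0.80               # constant share of pregnancies identified

observed_before = true_before * detection_rate   # about 80 identified
observed_after = true_after * detection_rate     # about 56 identified

true_change = percent_change(true_before, true_after)
observed_change = percent_change(observed_before, observed_after)

print(f"True change:     {true_change:.1f}%")
print(f"Observed change: {observed_change:.1f}%")
```

Both figures come out as a 30% decline. If the detection rate differed between the two periods, the two figures would no longer agree, which is why the second condition matters.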

When comparing the contraceptive, pregnancy, and STD rates of students who have
taken sexuality education and students who have not taken sexuality education, you
should be certain that the two groups are similar in other respects. For example,
it would be completely invalid to compare all students who have taken sexuality
education with all who have not taken sexuality education, because those who have
taken sexuality education are probably older and more sexually experienced. Thus,
for example, you should compare freshmen who have taken sexuality education with
freshmen who have not; sophomores who have taken sexuality education with sophomores
who have not; etc. Similarly, you should control for other important factors that
might exist such as race, sex, and religion.
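
One way to organize such a comparison is sketched below in Python. The counts are invented purely for illustration and are not data from any program; the dictionary layout and function names are likewise assumptions.

```python
# Hypothetical counts, invented for illustration only:
# (grade, took_course) -> (number of students, number of pregnancies)
counts = {
    ("freshman", True): (100, 4),
    ("freshman", False): (200, 10),
    ("sophomore", True): (150, 9),
    ("sophomore", False): (100, 8),
}


def rate(grade, took_course):
    """Pregnancy rate for one grade and one course status."""
    students, pregnancies = counts[(grade, took_course)]
    return pregnancies / students


def compare_within_grades(grades):
    """Return {grade: (rate with course, rate without course)}."""
    return {g: (rate(g, True), rate(g, False)) for g in grades}


comparison = compare_within_grades(["freshman", "sophomore"])
for grade, (with_course, without_course) in comparison.items():
    print(f"{grade}: {with_course:.1%} with course, {without_course:.1%} without")
```

The same layout extends to other control factors by adding them to the dictionary key.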



Using Unobtrusive Measures to Evaluate Other Goals

Comprehensive sexuality education programs frequently have many goals other
than increasing use of contraception and reducing unwanted pregnancies and cases of
STD, and some of these can be measured by unobtrusive methods. If a goal of your
program is to increase mature discussion of sexuality and decrease exploitive
thinking, you could count, both before and after the program is implemented, the
number of sexist or "dirty" comments or jokes in the locker rooms of the school, in
the hallways, or on the bathroom walls. If a goal is to improve skills in
communication and conflict resolution, you could tally the number of fights on the
school grounds before and after program implementation. If a goal is to reduce
inappropriate public sexual behavior, you could observe and estimate the amount of
necking in the hallways. If a goal is to improve thoughtful communication about
sexuality, you could monitor the number of articles in the school newspaper, the
number of discussion groups about sexuality, or the treatment of sexuality at other
school functions.

None of these unobtrusive methods is particularly ideal. Each of them has one or
more problems that may reduce its validity. However, unobtrusive methods can be
very useful as methods that are maximally different from obtrusive methods and thus
provide an independent source of evidence for the success of a program. You should
use your creativity and design additional unobtrusive methods suitable for the
evaluation of your own program.



Webb, E.J., & Campbell, D.T. Nonreactive Measures in the Social Sciences. 2d ed.
Boston: Houghton Mifflin, 1981.






CHAPTER 13



This chapter discusses procedures for preparing questionnaire data for
statistical analysis. The next chapter discusses specific statistics to use when
analyzing the data.



Doing the Analysis by Hand Versus Computer

When analyzing quantitative data, first decide whether to analyze the data by
hand or to use a computer. Doing it by hand is probably best whenever all of the
following conditions exist:

• The data contain a small number of cases (respondents).

• The data contain a small number of variables per case.

• You desire only simple statistical analyses, such as frequencies and mean
scores.

Using a computer is usually necessary whenever 1) the data contain a large
number of cases or variables, or 2) you wish to use more complex statistical
analyses such as tests of significance, correlation coefficients, or reliability
coefficients. If you decide the data or data analysis require a computer, but do
not know how to use one, you can usually hire a university graduate student or other
consultant to conduct the statistical analysis. If you do hire someone, make sure
that person has had considerable experience with the kinds of data and data
analysis that you will have.

Coding Questionnaire Data

Coding the questionnaire data is the process of translating answers on the
questionnaires into numbers (or occasionally letters) that can be subsequently
keypunched. Coding can take a substantial amount of time and may introduce
additional errors. Therefore, whenever possible, you should design the
questionnaires so that the keypuncher can punch the data directly from the
questionnaire without someone coding all the data on separate sheets of paper.

Develop a Codebook

The first step in coding the data is to develop a codebook, the directions that
translate a set of questions into a matrix of numbers or letters that are to be
entered into the computer. The codebook must accurately specify 1) how each answer
on the questionnaire should be translated into a letter or number, and 2) the
correct column placement in the data matrix for each question or variable.

• If possible, use only numbers in the data matrix.






• Assign each questionnaire a separate ID number. It is often convenient to
use sequential numbers, or if you are matching pretests and posttests, to
use as the ID number the birthdate or other number that is your basis for
matching.

• Code important information that may not be written on the questionnaire:
whether the questionnaire is a pretest or posttest, whether the respondent
is part of the experimental or control group, which class and teacher the
respondent had, etc.

• Be sure that every answer is assigned one and only one number.

• If respondents write answers different from the possible choices that
you provided in the questionnaire, and if you wish to code these answers,
then 1) give each of them a different code, or divide the answers into groups
and give each group a separate code, or 2) combine them all into an "Other"
category.

• When you have missing data, use a missing data code equal to some number
that can never be a valid answer. Coders often use "9" as the missing
data code for one column answers and "99" as the missing data code for two
column answers. Other kinds of missing data, such as "Does Not Apply," can
be given the same missing data code, or if you might want to analyze them
separately, a different number such as "8" or "98" that cannot be valid.

• If feasible, assign columns in the data matrix in the same order as the
questions in the questionnaire.

• Plan coding according to the capability of your equipment. If you will be
keypunching onto IBM cards, you will be limited to 80 columns per line. If
you will be keypunching directly onto magnetic tape or into the computer,
you should probably not exceed 132 columns per line, because most printers
which will print out your data file are limited to 132 columns per line.
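
The idea of a codebook can be illustrated with a short Python sketch. Everything here is hypothetical: the variable names, the column positions, and the sample data line are invented to show how column placement and missing data codes fit together, and are not taken from any actual questionnaire.

```python
# A hypothetical codebook: variable name -> (first column, width, missing code).
# Columns are numbered from 1, as on a coding sheet.
CODEBOOK = {
    "ID":    (1, 3, None),   # questionnaire ID number
    "TEST":  (4, 1, None),   # 1 = pretest, 2 = posttest
    "GROUP": (5, 1, None),   # 0 = control, 1 = experimental
    "Q1":    (6, 1, 9),      # one-column answer, 9 = missing
    "Q2":    (7, 2, 99),     # two-column answer, 99 = missing
}


def decode_line(line, codebook=CODEBOOK):
    """Translate one fixed-width data line into a dict of values.

    Missing data codes are converted to None.
    """
    record = {}
    for name, (start, width, missing) in codebook.items():
        value = int(line[start - 1:start - 1 + width])
        record[name] = None if value == missing else value
    return record


# ID 017, pretest, experimental group, Q1 = 4, Q2 missing (99)
record = decode_line("01711499")
print(record)
```

Keeping the column layout in one table like this makes it easier to confirm that every answer has one and only one code and one place in the data matrix.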

Code the Data

Coding the data is a process of preparing data so that someone can easily
keypunch it. You should either 1) add identification numbers and/or other numbers
to the questionnaires so that the keypuncher can keypunch directly from the
questionnaire or 2) copy the identification numbers and all other data from the
questionnaires onto a specially prepared sheet of paper from which the keypuncher
will keypunch the data. If the questionnaire is rather straightforward, then the
first way is certainly easier and probably more reliable because it eliminates the
tedious step of copying numbers by hand. On the other hand, if the questionnaire is
complex and requires some thought to code the questions, then the second method may
be necessary.

To include the answers to open-ended questions in the data, you must code them
first before they can be keypunched.

When coding any data, be sure to do it carefully. If your concentration
diminishes, take breaks.






Check the Reliability of Coding



Whenever the questionnaires include open-ended questions that require careful
consideration in coding, at least two people should code some of the questionnaires
and their codes should be compared. If the questionnaires include only closed-ended
questions that can be coded with little thought, then only one person needs to code
the data, but that coder should still periodically complete spot checks of the
coding to assure that he or she is not making any errors.
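
A simple way to compare two coders is to compute the percentage of answers on which they agree. The Python sketch below is one possible approach, not a procedure prescribed by this handbook; the codes shown are invented for illustration.

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of answers to which two coders gave the same code."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both coders must code the same set of answers")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100 * matches / len(codes_a)


def disagreements(codes_a, codes_b):
    """Positions (counting from 0) where the two coders differ."""
    return [i for i, (a, b) in enumerate(zip(codes_a, codes_b)) if a != b]


# Hypothetical codes assigned by two coders to the same ten open-ended answers
coder_1 = [1, 3, 2, 2, 5, 1, 4, 3, 2, 1]
coder_2 = [1, 3, 2, 4, 5, 1, 4, 3, 1, 1]

agreement = percent_agreement(coder_1, coder_2)
to_review = disagreements(coder_1, coder_2)
print(f"Agreement: {agreement:.0f}%; review answers at positions {to_review}")
```

The list of disagreements tells the coders which answers to discuss so that the coding rules can be tightened.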



Keypunching the Data

If you are not an experienced keypuncher and have a substantial amount of data,
you should probably use a professional keypunching firm. Such firms commonly
keypunch data very rapidly and hence also inexpensively; they make far fewer errors
than novices; and they utilize computer programs that further reduce errors. For
example, some programs check each data entry to ensure that it is an acceptable
entry for that column.

If funds are available, verifying the data is usually a relatively inexpensive
option that further reduces error. During the process of verification, one or two
keypunchers punch each questionnaire twice, and the computer compares the two
copies. If there is any discrepancy, the computer alerts the keypuncher who then
ascertains the correct entry.
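
The comparison step in verification can be sketched as follows in Python. This is a minimal illustration under the assumption that each keying pass is available as a list of text lines; the function name and sample lines are hypothetical.

```python
def find_discrepancies(first_pass, second_pass):
    """Compare two keyings of the same questionnaires.

    Returns a list of (line number, column number) pairs, both counted
    from 1. A column number of 0 marks a line whose two keyings differ
    in length.
    """
    if len(first_pass) != len(second_pass):
        raise ValueError("The two passes contain different numbers of lines")
    problems = []
    for line_no, (a, b) in enumerate(zip(first_pass, second_pass), start=1):
        if len(a) != len(b):
            problems.append((line_no, 0))
            continue
        for col_no, (x, y) in enumerate(zip(a, b), start=1):
            if x != y:
                problems.append((line_no, col_no))
    return problems


# Hypothetical data lines keyed twice
first = ["01711499", "01821203"]
second = ["01711499", "01821208"]

problems = find_discrepancies(first, second)
print(problems)
```

Each reported position still has to be checked against the original questionnaire, since the comparison shows only that the two keyings differ, not which one is right.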

Experience has clearly demonstrated that nonprofessionals make many errors.
Thus, if you are not an experienced keypuncher, but nevertheless must do the
keypunching, you should definitely use some reliable method for checking errors,
such as checking all your work, or preferably, asking someone else to check it.

If the data is not keypunched directly into the computer where it will be
analyzed, then it must be put onto some kind of device so that it can be transferred
to the desired computer. If the amount of data is relatively small, IBM cards may
be easier. If the amount of data is large, then magnetic tapes should be used.
Because the specifications for these tapes differ with each computer installation,
you should check with the installation.

Setting Up Keypunched Data on the Computer

When you put the keypunched data into the computer, the resulting file will
automatically be a rectangular matrix of numbers. Sometimes this matrix of numbers
can be statistically analyzed by the computer as it is.

However, if you used an experimental design and have both pretests and
posttests, then you must treat the data as data from either independent samples or
matched samples. Treating the data as data from matched samples allows you to later
conduct certain statistical tests that are more powerful and provide more valid
information than others that do not require matched samples.

Independent samples. If you do not have the identification numbers for each
respondent and cannot or will not match each pretest with each posttest, then you
must treat the data as data from independent samples. In this case the pretests and
posttests can be entered in any order, although often it is more convenient to enter
all the pretests first and all the posttests second. When specifying the names of
the variables in the SPSS file (discussed below), use the same names for both
pretest and posttest variables. For example, if you label the answer to question #1
as Q1 on the pretest, you should also call it Q1 on the posttest. You must also
have some variable which indicates whether a given case is a pretest or posttest.

Matched samples. If you have identification numbers and can match each pretest
with each posttest, then you should (but do not have to) treat the data as data from
matched samples. To do this, you must include for each respondent first the pretest
data and then the posttest. To group the data in this manner, you may 1) sort the
data by the column indicating whether the case is a pretest or posttest, 2) sort
the data by identification number, and 3) eliminate any cases for which you do not
have both pretest and posttest data. If you have to eliminate many cases, then your
remaining sample may be biased and you should either treat the data as data from
independent samples or treat the data first as data from independent samples and
then as data from matched samples and compare the results of the two analyses.

Specify different names for the pretest and posttest variables. For
convenience, begin each variable on the pretest with the letter A (A1, A2, A3, etc.)
and each variable on the posttest with the letter B (B1, B2, B3, etc.). Any
subsequent posttests may be labeled C, D, etc.
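
The matching steps described above can be sketched in Python as follows. This is an illustration only: the record layout (a dictionary per questionnaire with "id", "test", and "Q" variables) and the sample cases are assumptions made for the example.

```python
def match_pre_post(cases):
    """Pair each pretest with its posttest by identification number.

    cases -- list of dicts with keys 'id', 'test' (1 = pretest,
             2 = posttest) and answers named Q1, Q2, ...
    Returns one row per matched respondent, with pretest answers renamed
    A1, A2, ... and posttest answers renamed B1, B2, ...; unmatched
    cases are dropped.
    """
    pre = {c["id"]: c for c in cases if c["test"] == 1}
    post = {c["id"]: c for c in cases if c["test"] == 2}
    matched = []
    for rid in sorted(pre.keys() & post.keys()):
        row = {"id": rid}
        for prefix, source in (("A", pre[rid]), ("B", post[rid])):
            for key, value in source.items():
                if key.startswith("Q"):
                    row[prefix + key[1:]] = value
        matched.append(row)
    return matched


# Hypothetical cases: respondent 17 has both tests, 23 and 31 only one
cases = [
    {"id": 17, "test": 1, "Q1": 2, "Q2": 4},
    {"id": 23, "test": 1, "Q1": 3, "Q2": 1},
    {"id": 17, "test": 2, "Q1": 4, "Q2": 5},
    {"id": 31, "test": 2, "Q1": 5, "Q2": 5},
]

matched = match_pre_post(cases)
dropped = len(cases) - 2 * len(matched)
print(matched, f"({dropped} unmatched questionnaires dropped)")
```

Counting the dropped questionnaires, as the last lines do, gives a quick indication of whether so many cases were eliminated that the remaining sample may be biased.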

Creating an SPSS Program File

Although many computer software packages are available, most researchers who
analyze social science data use the Statistical Package for the Social Sciences
(SPSS), and it is highly recommended. SPSS can conduct all the kinds of statistical
analyses that the user may desire. It can handle missing data (which you are likely
to have), and it has excellent handbooks on its use.

Following are several suggestions for setting up the SPSS file cards:

• Assign each variable a label similar to the questionnaire numbers (e.g.,
Question #1 would be Q1; Question #2 would be Q2; etc.). If the file
contains pretest and posttest data, use A1, A2, etc., and B1, B2, etc., as
discussed above.

• Use the COUNT card to score knowledge tests.

• Use the RECODE cards to reverse the order of any variables that should be
reversed (e.g., negative statements that are part of indices).

• Use the COMPUTE cards to add together different variables to form indices.
When doing this, be sure to divide by the number of variables in each index
and be sure to use the ASSIGN MISSING card to handle missing data.
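
For readers who want to see the logic behind these suggestions, the following Python sketch shows the same three operations: counting correct answers, reversing an item, and averaging items into an index. It is not SPSS syntax and does not claim to reproduce how SPSS behaves; the function names, the 1-to-5 scale, and the sample answers are assumptions made for illustration.

```python
def count_correct(answers, key):
    """Score a knowledge test: number of answers that match the key."""
    return sum(a == k for a, k in zip(answers, key))


def reverse(value, low=1, high=5):
    """Reverse a scale item (1 becomes 5, 2 becomes 4, ...); None stays None."""
    return None if value is None else low + high - value


def index_score(values, min_valid=None):
    """Mean of the items in an index.

    By default the index is treated as missing (None) if any item is
    missing; pass min_valid to allow an index based on fewer items.
    """
    valid = [v for v in values if v is not None]
    if min_valid is None:
        min_valid = len(values)
    if not valid or len(valid) < min_valid:
        return None
    return sum(valid) / len(valid)


# Hypothetical answers: the second item is a negatively worded statement
q1, q2, q3 = 4, 2, 5
score = index_score([q1, reverse(q2), q3])
print(f"Index score: {score:.2f}")
```

Dividing by the number of items keeps the index on the same scale as the original questions, which makes the scores easier to interpret.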

Cleaning the Data

Regardless of the care with which students complete questionnaires, coders code
them, and keypunchers keypunch them, all these people are human and invariably make
mistakes. Some simple coding and keypunching errors can produce incredible errors
in conclusions. For example, consider a question asking for the number of times
respondents had sex in the last month. Imagine that "99" is being used as the
missing data code for a two column answer, and the keypuncher accidentally punches
"98" instead; you might improperly conclude that there had been a change in
sexual behavior. Thus, properly checking and cleaning the data can be extremely
important and should be completed before conducting any serious analysis.



Scan the Data Matrix 

Print out the data file and examine it in the following ways:

• Check to see that all the cases have the correct number of lines.

• If there are pretests and posttests for each case, be sure each case has
the required number of tests.

• Be sure all the lines in the files have the correct number of columns. If
all the cases have only one line, then the right hand boundary should be
straight. If the cases have more than one line, then the right hand
boundary should be consistent; it should form either a straight line or a
regular pattern.

• Be sure other columns appear to be aligned properly. For example, if one
or more columns should be blank, scan down the file and be sure that they
are in fact blank. Or, if one or more columns should have only "0"s or
"1"s in them, be sure that they do in fact contain only "0"s and "1"s.

• Check each case for large amounts of missing data. If a case is missing
only a small amount of data, you should keep the case and use one of
several useful options SPSS offers to handle missing data. If the case is
missing a large amount of data, you should probably exclude the entire
case. The missing data may indicate that the respondent had too little
time, was confused, or did not treat the questionnaire seriously;
consequently, what data does exist may not be valid.

• Check each case for response sets. If you find the same answer for several
consecutive questions on a knowledge test or attitude inventory, you should
consider excluding the case. The respondent probably did not read these
questions or may not have taken the questionnaire seriously. For example,
if a respondent answered eight successive questions on an attitude
inventory with a "5," and if logically consistent answers would require
some high and some low answers, then that data is probably invalid.
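
Several of these checks can also be done by a short program rather than by eye. The Python sketch below is one possible way to do so; the function names, the missing data code of 9, and the sample values are assumptions for illustration.

```python
def bad_width_lines(lines, expected_width):
    """Line numbers (from 1) whose length differs from the expected width."""
    return [n for n, line in enumerate(lines, start=1)
            if len(line) != expected_width]


def share_missing(answers, missing_code=9):
    """Proportion of a case's answers that carry the missing data code."""
    return sum(a == missing_code for a in answers) / len(answers)


def longest_run(answers):
    """Length of the longest run of identical consecutive answers."""
    best = run = 0
    previous = object()   # sentinel that equals no real answer
    for a in answers:
        run = run + 1 if a == previous else 1
        best = max(best, run)
        previous = a
    return best


# Hypothetical data
lines = ["01711499", "0182120", "01934599"]
attitude_answers = [3, 5, 5, 5, 5, 5, 5, 5, 5, 2]

print("Lines with wrong width:", bad_width_lines(lines, 8))
print("Share missing:", share_missing([1, 9, 9, 3]))
print("Longest run of identical answers:", longest_run(attitude_answers))
```

These functions only flag cases for inspection. The decision to exclude a case should still rest on the validity of the data, as discussed below.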

When making decisions about whether to exclude cases because of missing data,
response sets, or any other reason, you should always do so blindly. That is, you
should never determine whether keeping or excluding a case will support or refute
your theoretical conclusions and then keep or exclude the case for that reason. To
do so would greatly bias your data and would violate basic principles of scientific
research. On the other hand, you should not keep data or cases that are clearly
invalid, because invalid data can also bias results. Thus, you should exclude cases
when the data is clearly unreliable or invalid, but your decisions should be
determined entirely by the validity of the data, not by the effect the data would
have on your conclusions.

Examine the Frequency Distributions

The FREQUENCIES program in SPSS can provide you with the frequency distribution
for each variable. Examine the frequencies to make sure that all the answers that
actually occur in the FREQUENCIES printout are reasonable numbers. If some numbers
are implausible, look first for the implausible numbers in the data file.

• If the numbers in the data file are not plausible, then return to the
original questionnaires, find the errors and correct the data file.

• If the original numbers reported in the questionnaire are excessively large
and clearly invalid, simply declare the offending numbers as missing data.
For example, if a student claims to have had sex 100 times in the last
week, this number should be declared missing data.

• If the numbers are correct in the data file, then the SPSS file must
contain errors, and you should correct these.

• If the SPSS file is correct, and it is impossible to find the original
correct numbers, simply declare the offending numbers as missing data.
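
The same kind of check can be sketched in Python. The answers below are invented for illustration, and the plausible range of 0 to 30 is an assumption chosen for the example, not a rule from this handbook.

```python
from collections import Counter


def frequencies(values):
    """Frequency of each value, ordered by value."""
    return dict(sorted(Counter(values).items()))


def implausible(values, low, high, missing_code=99):
    """Distinct values outside the plausible range, ignoring the missing code."""
    return sorted({v for v in values
                   if v != missing_code and not low <= v <= high})


# Hypothetical answers to "number of times had sex in the last month";
# 99 is the missing data code and one entry was mispunched as 98.
times_had_sex = [0, 2, 0, 99, 4, 98, 1, 0, 3, 99]

freq = frequencies(times_had_sex)
suspect = implausible(times_had_sex, low=0, high=30)
print(freq)
print("Values to check against the questionnaires:", suspect)
```

Here the value 98 stands out at once, which is the kind of keypunching error described earlier in this chapter.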

Reference 

Nie, N.; Hull, C.H.; Jenkins, J.; Steinbrenner, K.; & Bent, D. Statistical Package
for the Social Sciences, 2d ed. New York: McGraw-Hill, 1975.






CHAPTER 14

STATISTICAL ANALYSIS



Some people are frightened by statistics, probably because they do not
understand them. However, there is nothing inherently magical, mystical, or
difficult about statistics. They are simply procedures to summarize data, make the
data more clear and understandable, and help answer specific questions.

This chapter discusses the basic procedures of statistical analysis that you
will need to analyze your data. It will assume that the computer will do most of
the calculating. If you need to know how to calculate by hand any of these
procedures, read one of the statistics books referenced at the end of the chapter.

The decision about what statistical procedures to use depends upon the type of
data that you have and upon the hypotheses that you wish to test. There are four
basic kinds of data: nominal, ordinal, interval, and ratio.

Kinds of Data 

Nominal 

Nominal variables are characterized by:

• mutually exclusive categories; that is, no person could fit into two or
more categories of the same variable.

• exhaustive categories; every person can fit into one category in each
variable.

• categories with assigned numbers that have no real meaning apart from their
function as codes; a larger number does not mean more or less of something
than a smaller number.

The following variables are examples of nominal variables:

Religion: 1=Catholic               Race: 1=White             Sex: 1=female
          2=Protestant                   2=Black                  2=male
          3=Jewish                       3=Hispanic
          4=Agnostic & Atheist           4=Oriental
          5=Other                        5=American Indian
                                         6=Other

Note that you cannot meaningfully add, subtract, divide, or multiply nominal
numbers. The average of "1" and "5" is "3," but the average of "Catholic" and
"Other" is not "Jewish."






Ordinal

Ordinal variables are characterized by categories that:

• are both mutually exclusive and exhaustive

• have assigned numbers whose order has real meaning; that is, the numbers
reflect the order of some real and underlying dimension.

In the example below of political beliefs, people who have a larger number are more
"liberal" or "leftist" than people with a lower number.

Political beliefs: 1=radical right
                   2=conservative
                   3=middle of the road
                   4=liberal
                   5=radical left

Belief about birth control: Two people should definitely use birth
control if they are having sex and do not wish to have children:

                   1=strongly agree
                   2=agree
                   3=neutral
                   4=disagree
                   5=strongly disagree

Sexual beliefs:    1=intercourse only acceptable in marriage
                   2=intercourse only acceptable if engaged
                   3=intercourse only acceptable if in love
                   4=intercourse only acceptable if there is caring
                   5=intercourse acceptable any time



Methodologists differ over whether or not you can add and subtract ordinal
variables. Methodological purists claim that you do not know that distances between
adjacent categories are equal, and thus you cannot add or subtract. Other
methodologists claim that commonly the distances are approximately equal, that small
errors make little difference, and that the advantages of being able to add and
subtract are great. A reasonable criterion is the following: if you believe that
you and other methodologists could reasonably consider the distances between
adjacent categories about equal, then treat the variable as interval (see discussion
below) and add and subtract them. Otherwise, you should not add or subtract them.
The first two examples of ordinal variables above could be treated as interval
because the categories are approximately equally far apart. The distances between
categories in the third example are less certain, but could probably also be treated
as interval.

Clearly you cannot multiply and divide the categories. For example, 4 is twice
as much as 2, but it would be meaningless to say that a liberal is twice as much as
a conservative.






Interval

Interval variables have

• mutually exclusive and exhaustive categories

• categories that have a natural order

• categories with meaningful distances between them.

For example, temperature as measured by Fahrenheit or Centigrade is an interval
variable; the difference between 70 and 80 degrees is meaningfully the same as the
difference between 90 and 100 degrees.

Interval data do lend themselves to some arithmetic operations. You can add
and subtract scores meaningfully and multiply or divide the sum or difference of two
or more scores. For example, you can average 70 and 80 degrees by adding the two
temperatures together (70 + 80 = 150) and dividing the sum by two (150/2 = 75).

Thus, you can calculate means on interval data. However, you cannot
meaningfully multiply or divide the original scores; 40 degrees is not twice as hot
as 20 degrees.

Surprisingly, you can assign the values 0 and 1 to a dichotomous variable and
treat it as interval.

Examples:

0=participated in the control group
1=participated in the experimental group

0=never had sexual intercourse
1=had sexual intercourse

0=had never gone to a clinic for contraception
1=had gone to a clinic for contraception

0=male
1=female

To be interval, the distances between categories must be equal; dichotomous
variables meet this criterion because the variables have only one "distance," and
thus the distances cannot be unequal. The real reason for treating dichotomous
variables as interval is that you can then perform various important arithmetic
operations without violating important statistical assumptions and producing
nonsensical outcomes.
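
One practical consequence can be shown in a few lines of Python: the mean of a variable coded 0 and 1 is simply the proportion of cases coded 1. The codes below are invented for illustration.

```python
# Hypothetical codes for eight respondents: 0 = control, 1 = experimental
group = [0, 1, 1, 0, 1, 0, 1, 1]

mean_of_codes = sum(group) / len(group)
proportion_experimental = group.count(1) / len(group)

print(f"Mean of the 0/1 codes:      {mean_of_codes}")
print(f"Proportion in experimental: {proportion_experimental}")
```

Because the two results are identical, a mean calculated on a dichotomous variable has a clear interpretation, which is not true of a mean calculated on a nominal variable with several categories.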



Ratio

Ratio variables have

• mutually exclusive and exhaustive categories

• categories that have a meaningful order

• distances between the categories that are meaningful

• a meaningful zero point.

The following examples are ratio variables:

weight
height
number of questions answered correctly on a knowledge test
number of hours that a sexuality education class lasts
number of times that respondents talked with their parents
number of times that respondents had sex in the previous month

Because these variables have a meaningful zero point, the ratio of two numbers
is meaningful. For example, 80 pounds is twice as much as 40 pounds, and eight acts
of intercourse is four times as much as two acts. Therefore, you can legitimately
perform all arithmetic operations on the scores. You can add, subtract, multiply,
and divide them. This gives them a great advantage over other kinds of variables.



Descriptive Statistics



There are two basic kinds of statistics, descriptive and inferential.
Descriptive statistics simply summarize and describe different properties of the
data. To make a statement about the students who participated in a program, you
would use only descriptive statistics. In contrast, inferential statistics help you
make inferences or generalizations from the sample before you to some larger
population. If you want to make a statement not only about the respondents, but
also about similar students who might take the course, then you should use
inferential statistics.

This section will discuss in order the steps that you should follow to simplify
and summarize your data. It will use the same data in each step as an example.
Suppose that you administered a 40-item knowledge test to 25 students both before
and after your course, and that you obtained the test scores in Figure 14-1.



Figure 14-1

Knowledge Test Scores Presented as Original Raw Data

Pretest Scores

31 19 24 28 14 38 21 26 29 12 26 28 37
20 30 24 25  9 30 20 23 28 17 28 34

Posttest Scores

22 38 28 34 18 36 33 33 24 40 15 27 30
31 40 33 13 21 35 26 38 39 20 34 35






To create an array, simply put all the numbers in numerical order. A computer
can do this for you, or you can create one yourself. The scores in the example are
presented in Figure 14-2.



Figure 14-2

Pretest and Posttest Scores Ordered in Arrays

Pretest Scores

 9 12 14 17 19 20 20 21 23 24 24 25 26
26 28 28 28 28 29 30 30 31 34 37 38

Posttest Scores

13 15 18 20 21 22 24 26 27 28 30 31 33
33 33 34 34 35 35 36 38 38 39 40 40



By simply ordering the scores in this example, you can see that posttest scores
are larger than the pretest scores. However, if you have a large number of cases,
simply presenting the array of scores requires too much space, and you need to
summarize further.



Frequency Distributions

A frequency distribution is a table that specifies the number of times that
each number appears in the array. In Figure 14-3 are the frequency distributions
for the scores in the example. As you can see, they present the scores in a manner
that is more clear, understandable, and concise than the original arrays. Note that
no information was lost by presenting the scores in this format.



Percentage and Cumulative Percentage Distributions

Often you will find it helpful to know not only how many individuals obtained
the different scores, but also the percentage of all individuals who obtained that
score. To create the percentage distribution, divide the frequency of each value by
the total number of cases. Sometimes you will want to know the percentage of all
scores of a particular size or smaller; to find the cumulative percentage, add the
percentage to all the percentages of lower scores. Figure 14-4 illustrates these
two distributions, using data from the example.

Cumulative percentage distributions have special importance when using
criterion referenced methods. Cumulative percentage distributions can show the
percentages of students who reached the criterion levels specified previously by
experts. You can use them to compare students' performance on the pretest and
posttest.
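
If you wish to produce these distributions yourself rather than with SPSS, the calculation can be sketched in Python as follows. The scores are the 25 pretest scores from the example; the function name and row layout are illustrative assumptions.

```python
from collections import Counter

# The 25 pretest scores from the example
pretest = [31, 19, 24, 28, 14, 38, 21, 26, 29, 12, 26, 28, 37,
           20, 30, 24, 25, 9, 30, 20, 23, 28, 17, 28, 34]


def distribution(scores):
    """Rows of (value, frequency, percent, cumulative percent)."""
    n = len(scores)
    rows = []
    running = 0   # number of scores at or below the current value
    for value, freq in sorted(Counter(scores).items()):
        running += freq
        rows.append((value, freq, 100 * freq / n, 100 * running / n))
    return rows


rows = distribution(pretest)
print("Value  Freq  Percent  Cum. Percent")
for value, freq, pct, cum in rows:
    print(f"{value:5d} {freq:5d} {pct:8.0f} {cum:13.0f}")
```

The cumulative percentage is calculated from the running count rather than by adding rounded percentages, which avoids small rounding errors when the number of cases does not divide evenly into 100.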






Figure 14-3

Knowledge Test Scores Presented in Frequency Distributions

   Pretest Scores          Posttest Scores
       (N=25)                   (N=25)

   Value     Freq          Value     Freq

     9         1            13         1
    12         1            15         1
    14         1            18         1
    17         1            20         1
    19         1            21         1
    20         2            22         1
    21         1            24         1
    23         1            26         1
    24         2            27         1
    25         1            28         1
    26         2            30         1
    28         4            31         1
    29         1            33         3
    30         2            34         2
    31         1            35         2
    34         1            36         1
    37         1            38         2
    38         1            39         1
                            40         2






Figure 14-4

Knowledge Test Scores Presented as Percentage
and Cumulative Percentage Distributions

Pretest Scores
(N=25)

                                   Cumulative
Value    Frequency    Percent      Percent

  9          1           4             4
 12          1           4             8
 14          1           4            12
 17          1           4            16
 19          1           4            20
 20          2           8            28
 21          1           4            32
 23          1           4            36
 24          2           8            44
 25          1           4            48
 26          2           8            56
 28          4          16            72
 29          1           4            76
 30          2           8            84
 31          1           4            88
 34          1           4            92
 37          1           4            96
 38          1           4           100

Posttest Scores
(N=25)

                                   Cumulative
Value    Frequency    Percent      Percent

 13          1           4             4
 15          1           4             8
 18          1           4            12
 20          1           4            16
 21          1           4            20
 22          1           4            24
 24          1           4            28
 26          1           4            32
 27          1           4            36
 28          1           4            40
 30          1           4            44
 31          1           4            48
 33          3          12            60
 34          2           8            68
 35          2           8            76
 36          1           4            80
 38          2           8            88
 39          1           4            92
 40          2           8           100






Grouped Frequency Tables



In the example above, the tables occupy a fair amount of space because they
contain a large number of different values. With an even larger number of
categories, such tables would become too cumbersome. To overcome this problem, you
can group the different values together into classes. There are three steps to this
process:

1. Calculate the size of the range of the scores; that is, subtract the
   smallest value from the largest.
2. Within that range, establish between 4 and 15 classes of equal size;
   classes should be mutually exclusive and have a logical size.
3. Count the number of scores in each class.

In the example, scores range from 9 to 40; the size of the range is 31. A
class interval of 5 would produce about 7 classes. Seven is a reasonable number of
classes, and a class interval of 5 is logical and easy to handle. This produces the
grouped frequency distributions in Figure 14-5.



Figure 14-5

Knowledge Test Scores Presented in Grouped Frequency Distributions



   Pretest Scores             Posttest Scores

   Classes   Frequency        Classes   Frequency

    6-10         1             6-10         0
   11-15         2            11-15         2
   16-20         4            16-20         2
   21-25         5            21-25         3
   26-30         9            26-30         4
   31-35         2            31-35         8
   36-40         2            36-40         6



There is no one correct grouped frequency distribution for such scores. For
example, if you want more detail, you could use a class interval of 3 and have more
classes.
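
The three grouping steps can also be sketched in code. This is a minimal Python illustration, not a required procedure. The function name and its parameters are assumptions made for the example, and the starting point of 6 and class interval of 5 are the ones used in Figure 14-5.

```python
# The 25 pretest and 25 posttest scores from the example.
pretest = [9, 12, 14, 17, 19, 20, 20, 21, 23, 24, 24, 25, 26, 26,
           28, 28, 28, 28, 29, 30, 30, 31, 34, 37, 38]
posttest = [13, 15, 18, 20, 21, 22, 24, 26, 27, 28, 30, 31, 33, 33,
            33, 34, 34, 35, 35, 36, 38, 38, 39, 40, 40]


def grouped_frequencies(scores, lowest, width, n_classes):
    """Count scores in equal-width, mutually exclusive classes."""
    table = []
    for i in range(n_classes):
        low = lowest + i * width
        high = low + width - 1  # classes do not overlap
        count = sum(1 for s in scores if low <= s <= high)
        table.append((f"{low}-{high}", count))
    return table


# Step 1: the size of the range across both sets of scores.
all_scores = pretest + posttest
print("Range:", max(all_scores) - min(all_scores))

# Steps 2 and 3: seven classes of width 5, starting at 6.
for (label, pre), (_, post) in zip(grouped_frequencies(pretest, 6, 5, 7),
                                   grouped_frequencies(posttest, 6, 5, 7)):
    print(f"{label:>6} {pre:>4} {post:>4}")
```

Changing the `width` argument to 3 (and raising `n_classes` accordingly) gives the more detailed grouping mentioned above.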



Bar Graph or Histogram 

Grouped frequency distributions can easily be turned into bar graphs or
histograms for greater visual impact. In general, you may find it useful to create
bar graphs of the most important outcomes that you wish to emphasize. (See the
example in Chapter 15.)



Measures of Central Tendency 

Commonly you will find it useful to further summarize a set of scores by
calculating a measure of central tendency or average. These averages do not present
as much information as frequency distributions, but they are obviously very
convenient summaries of the data. There are three different measures of central
tendency.

Mode. The mode is the value with the largest frequency. In the example above,
the mode for the pretest is 28; for the posttest, 33.

Note that finding the mode does not require either ordering the cases or
performing any arithmetic operations on the scores. Thus, it is the only measure of
central tendency that you can use with nominal data. You can also use it with
ordinal, interval, and ratio variables, although it is not as good for these kinds
of variables as the following measures.

Median. The median is the value of the middle score after the scores have been
ordered. If there is an odd number of scores, then there is only one score in the
middle and that is the median. For example, if there are 25 scores, the median is
the value of the 13th score in the array. If there is an even number of scores, the
median is the average of the two scores in the middle. In the example, the medians
of the pretests and posttests are 26 and 33 respectively.

Finding the median does require the scores to be in order, and thus it is
generally the best measure of central tendency to calculate with ordinal data. You
can also use it with interval and ratio data, although in those cases the mean is
normally better.

In the pretest scores above, if you decreased or increased any of the scores
below 26 without letting them exceed 26, the median would not change. Similarly, if
you decreased or increased any of the scores above 26 without letting them fall
below 26, the median still would not change. Thus, the median is not a very
sensitive measure and wastes information. Normally this insensitivity is not
desirable. However, if you have interval data with one or a couple of very extreme
scores that would greatly distort the mean, the insensitivity of the median would
make it a better measure of central tendency.

Mean. The mean is the average of the scores and is obtained by calculating the
sum of the scores and dividing by the number of scores. In the examples above, the
means of the pretests and posttests are 621/25 = 24.84 and 743/25 = 29.72 respectively.

The mean does involve addition; thus, the data must be either interval or
ratio. For these two, it is commonly the best measure of central tendency to
calculate. Once again, if you have ordinal variables with categories that appear to
be equal, then you can consider calculating the mean, although the median would be a
more rigorously correct measure.

In contrast to the median, the mean is affected by all the scores, because in
the first step it adds every value. Thus, it is normally a better measure of
central tendency than either the mode or median. However, as noted above, if an
extreme score would greatly distort it in a misleading way, the median would be
better.
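
The three measures can be checked quickly with Python's standard `statistics` module. This is only a sketch. The helper name `central_tendency` and the extreme score of 380 are assumptions introduced for illustration, and the score lists are the example data.

```python
from statistics import mean, median, mode

pretest = [9, 12, 14, 17, 19, 20, 20, 21, 23, 24, 24, 25, 26, 26,
           28, 28, 28, 28, 29, 30, 30, 31, 34, 37, 38]
posttest = [13, 15, 18, 20, 21, 22, 24, 26, 27, 28, 30, 31, 33, 33,
            33, 34, 34, 35, 35, 36, 38, 38, 39, 40, 40]


def central_tendency(scores):
    """Return the mode, median, and mean of a list of scores."""
    return {"mode": mode(scores),
            "median": median(scores),
            "mean": mean(scores)}


print("Pretest: ", central_tendency(pretest))
print("Posttest:", central_tendency(posttest))

# Hypothetical extreme score: the highest pretest score (38) is
# mistakenly entered as 380. The mean shifts, the median does not.
with_extreme = pretest[:-1] + [380]
print("With extreme score:", central_tendency(with_extreme))
```

The last lines illustrate why the median can be the better summary when one or two extreme scores would distort the mean.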

Using measures of central tendency in comparisons. Normally in evaluation, you
will not be concerned with a single mode, median, or mean; rather, you will want to
compare one mode with another, one median with another, etc. In the example above,
neither the pretest nor the posttest mean is particularly useful alone. It is the
comparison between the pretest and the posttest means that is useful.
Chapter 3, which discusses experimental designs, discusses proper methods of
comparing different means in different experimental designs.



Measure of Dispersion 

Just as describing the central tendency of data by finding the mode, median, or
mean is commonly useful, so is measuring the extent to which the scores are
dispersed or spread apart. For example, the mean temperature in the summer may be
80 degrees, but if the weather ranges from 60 to 110 degrees, people may be much
less comfortable than if it ranges from 75 to 85 degrees. Similarly, if everyone in
your class has about the same score on the test, then you can target your teaching
to that particular level. On the other hand, if some students perform very well and
others very poorly on a pretest, you will need to consider their varying skills.

You can obtain some estimate of the dispersion of scores by observing either
the frequency distribution or the grouped frequency distribution. However,
sometimes it is more convenient to summarize the dispersion with a single number.

Range. The range is simply the largest observed score minus the smallest
observed score. Thus, the pretest and posttest scores have ranges of 29 and 27
respectively. In order to find the range, you must be able to order the numbers;
thus, the data must be either ordinal, interval, or ratio.

Variance. To calculate the variance, find the distance of each score from the
mean, square the distances, and find the mean of the squared distances. If all the
scores are close together, the sum of the distances between each score and the mean
will be small and the variance will be small. In contrast, if the scores are spread
out considerably, the distances from the scores to the mean will be larger and the
variance will be larger. Since the variance is affected by all the scores, not only
by the end scores, it is a more sensitive and useful measure of dispersion than the
range.

Variance involves adding and subtracting scores, so the data must be interval
or ratio.

Sometimes statistical formulas or computer outputs require or provide the
standard deviation. It is simply the square root of the variance. Thus, if the
variance is 4, the standard deviation is 2.
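
The following Python sketch computes the range, variance, and standard deviation exactly as described above, dividing the sum of squared distances by the number of scores. The function names are illustrative assumptions. Be aware that many statistical packages instead divide by the number of scores minus one (the sample variance), so their output may differ slightly from this calculation.

```python
import math

pretest = [9, 12, 14, 17, 19, 20, 20, 21, 23, 24, 24, 25, 26, 26,
           28, 28, 28, 28, 29, 30, 30, 31, 34, 37, 38]
posttest = [13, 15, 18, 20, 21, 22, 24, 26, 27, 28, 30, 31, 33, 33,
            33, 34, 34, 35, 35, 36, 38, 38, 39, 40, 40]


def score_range(scores):
    """Largest observed score minus the smallest observed score."""
    return max(scores) - min(scores)


def variance(scores):
    """Mean of the squared distances of each score from the mean."""
    mean = sum(scores) / len(scores)
    return sum((s - mean) ** 2 for s in scores) / len(scores)


def standard_deviation(scores):
    """Square root of the variance."""
    return math.sqrt(variance(scores))


for name, scores in (("Pretest", pretest), ("Posttest", posttest)):
    print(name, score_range(scores),
          round(variance(scores), 2),
          round(standard_deviation(scores), 2))
```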



Inferential Statistics 

Suppose you completed a 50-item multiple-choice knowledge test without reading
the questions. That is, you marked off one answer on each multiple-choice question,
but you had no idea what the correct answer was, because you had not read the
question. If each question had 5 possible answers, you might guess correctly about
10 of them. If you repeated the test, you might guess correctly only 6 of them or
15.

This example is realistic in some ways. People do guess on tests. Sometimes
they are lucky and guess many questions correctly; other times they are unlucky and
guess most questions incorrectly. Moreover, during actual testing of groups, many
other chance factors (e.g., respondents' having colds) affect the overall scores.
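
The guessing example can be made concrete with a small simulation. The sketch below, in Python, is an illustration only: the number of repetitions (1,000) and the random seed are arbitrary assumptions, and the exact scores it prints will depend on them.

```python
import math
import random

N_ITEMS = 50        # questions on the test
N_CHOICES = 5       # possible answers per question
P_CORRECT = 1 / N_CHOICES

# Expected number of correct guesses and the typical spread around it
# (the standard deviation of a binomial count).
expected = N_ITEMS * P_CORRECT
spread = math.sqrt(N_ITEMS * P_CORRECT * (1 - P_CORRECT))
print("Expected correct:", expected, "typical spread:", round(spread, 2))


def guess_once(rng):
    """Score of one person who guesses blindly on every question."""
    return sum(1 for _ in range(N_ITEMS) if rng.random() < P_CORRECT)


rng = random.Random(0)  # arbitrary seed so the run can be repeated
scores = [guess_once(rng) for _ in range(1000)]
print("Lowest:", min(scores), "Highest:", max(scores),
      "Mean:", sum(scores) / len(scores))
```

The expected score is 10 with a typical spread of just under 3, which is why scores such as 6 or 15 can arise from luck alone.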






If the mean score on your posttest is better than the mean score on the
pretest, you can describe the posttest mean as higher than the pretest mean. But
descriptive statistics alone cannot tell you whether the improvement was probably
caused by the program or by a myriad of chance factors. However, you can determine
whether chance factors or the program probably produced your results by using
inferential statistics, and in particular, tests of significance.

Obtaining the following tests of significance is normally easier on a computer
than by hand. If you do wish to calculate them by hand, consult one of the
statistics books referenced at the end of the chapter. Calculating a t-test by hand
is not very difficult if you have a small number of cases.

Tests of Significance

T-tests. The t-test provides the probability that the difference between two
means could have occurred by chance. Consequently, whenever you want to know
whether or not the difference between two means is statistically significant, you
should conduct a t-test.

T-tests require interval or ratio data. However, if the data are ordinal, have
a small number of categories, and appear to have roughly equal intervals, then you
can safely conduct a t-test. Technically, the t-test also requires that the
distribution of each sample be normal and that the two samples have similar standard
deviations. However, both of these requirements can be violated without much effect
if the sample sizes are similar or are larger than 50. Because you are likely to
have similar sample sizes in your analysis, you will commonly be safe using the
t-test.

You can use the t-test in several different experimental designs. If you have
a pretest-posttest design, use the t-test to determine whether the difference
between the pretest and posttest means is significant.

• If the pretests and posttests of each case cannot be matched, use a
separate or independent samples t-test.

• If cases can be matched, use the matched pairs t-test; it is both more
powerful and more valid.

For a classical experimental design that includes experimental and control
groups:

• If you have independent sample data and not matched pairs data, find the
mean improvement in the experimental group (by subtracting the pretest mean
from the posttest mean) and then compare it with the improvement in the
control group.

• If you do have matched pairs data, subtract the pretest score from the
posttest score for each individual and compare the mean improvement of the
experimental group with that of the control group.
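
For a small number of cases, the two kinds of t-test can be sketched as follows. This Python example is a hedged illustration, not the handbook's own procedure: the function names are assumptions, the five matched scores are hypothetical, and the independent samples version pools the two variances, which presumes the similar standard deviations mentioned above. The code yields only the t value; the probability must still be obtained from a t table or a statistical package, using N - 1 degrees of freedom for matched pairs and N1 + N2 - 2 for independent samples.

```python
import math
from statistics import mean, stdev, variance


def paired_t(pre, post):
    """Matched pairs t: mean gain divided by the standard error of the gains."""
    gains = [after - before for before, after in zip(pre, post)]
    return mean(gains) / (stdev(gains) / math.sqrt(len(gains)))


def independent_t(group1, group2):
    """Independent samples t, pooling the two sample variances."""
    n1, n2 = len(group1), len(group2)
    pooled = ((n1 - 1) * variance(group1)
              + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group2) - mean(group1)) / math.sqrt(pooled * (1 / n1 + 1 / n2))


# Hypothetical scores for five students who took both tests.
pre = [20, 22, 25, 30, 18]
post = [24, 25, 27, 36, 20]

print("Matched pairs t:      ", round(paired_t(pre, post), 2))
print("Independent samples t:", round(independent_t(pre, post), 2))
```

With these hypothetical scores the matched pairs t is much larger than the independent samples t, which illustrates why matching is the more powerful approach when it is possible.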

Analysis of variance. The analysis of variance provides the probability that
the differences among two or more means could have occurred by chance. If you have
only two means, the analysis of variance will give you the same probability as the
t-test. In fact, the t-test is a special case of the analysis of variance. Because
t-test programs are easier to run, you should use t-tests with only two means.

If you have three or more means, use the analysis of variance to determine
whether the differences among them are statistically significant. It is especially
well suited to comparing several different experimental groups or several different
control groups (e.g., Solomon four-group design).
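
The calculation behind a one-way analysis of variance can be sketched briefly. The Python function below is an illustrative assumption, not a replacement for a statistical package, and the three groups of scores are hypothetical. It returns the F ratio (variation between group means divided by variation within groups); the probability still has to be looked up for that F value.

```python
from statistics import mean


def one_way_f(groups):
    """F ratio: mean square between groups over mean square within groups."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical improvement scores for three groups of students.
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print("F for three groups:", one_way_f(groups))

# With only two groups, F equals the square of the independent
# samples t value, which is why the two tests agree.
print("F for two groups:  ", one_way_f(groups[:2]))
```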

If you wish to observe the relative impact upon different groups of people,
some of whom participated in the program and some of whom did not, use two-way
analysis of variance. For example, you may have given the program to half the
freshman class, half the sophomore class, half the junior class, and half the senior
class. Thus, you have an experimental group for each of the four classes and want
to determine which class was most affected. A discussion of two-way analysis of
variance is beyond the scope of this handbook, but can be found in Blalock (1972).

Levels of Significance 

When you conduct any of the tests of significance above, you will obtain a
number representing the probability of the data occurring by chance. That is, the
number specifies the probability that you would have obtained these scores even if
the groups were not different. In the example used throughout this chapter,
probability = .028. This means that even if the program had no impact, you would
obtain a difference between your pretest and posttest means of at least 4.9 in 28
times out of 1,000. Because 28 times out of 1,000 is a small number of times, you
can conclude that chance factors probably did not produce the difference between the
pretest and posttest means and be right 972 times out of 1,000.

When you obtain your probability, you should round it upward to a level of
significance. In inferential statistics, there are three important levels of
significance: .05, .01, and .001.

• If the probability you obtained is greater than .05, state that the results
are not statistically significant.

• If the probability you obtained is less than or equal to .05, and greater
than .01, state that the results are statistically significant at the .05
level.

• If the probability you obtained is less than or equal to .01, and greater
than .001, state that the results are statistically significant at the .01
level.

• If the probability you obtained is less than or equal to .001, state that
the results are statistically significant at the .001 level.

The .05 level of significance means that the probability of your data being
produced by chance alone is 5 chances out of 100 or less. Conversely, it means that
the chances that the program or some other important factor (other than luck)
produced the change are 95 out of 100 or more. Similarly, the .01 level of
significance means that chance factors would produce these results only 1 time out
of 100 or less; the .001 level, 1 time out of 1,000 or less. Clearly, .001 is
better than .01, and both are better than .05.

There is nothing magical about these particular levels of significance. They
have simply been selected by convention.
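
The rounding rules above are easy to apply mechanically. The short Python sketch below encodes them; the function name is an assumption made for illustration.

```python
def significance_level(probability):
    """Round a probability upward to a conventional level of significance.

    Returns ".001", ".01", or ".05", or None when the result is not
    statistically significant (probability greater than .05).
    """
    if probability <= 0.001:
        return ".001"
    if probability <= 0.01:
        return ".01"
    if probability <= 0.05:
        return ".05"
    return None


for p in (0.0004, 0.008, 0.028, 0.20):
    level = significance_level(p)
    if level is None:
        print(f"p = {p}: not statistically significant")
    else:
        print(f"p = {p}: significant at the {level} level")
```

Applied to the probability of .028 from the chapter example, the function reports significance at the .05 level.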






Meaningfulness of Results



Tests of significance can tell you whether your program probably had an impact.
However, they cannot tell you anything about the magnitude or importance of that
impact. Particularly if your sample size is large, the program could have a very
small but statistically significant impact. Thus, as a very important last step in
your analysis, you should examine the magnitude of the impact and consider its
importance.

If you have used criterion referenced methods, then you should give
considerable weight to the percentage of people that meet the desired levels and
also to the increase in the percentage of people who meet these levels.

If you chose not to specify levels of competence, then you should look at the
change in the mean scores. Often it is useful to view the increase in mean scores
as a percentage of the possible range. For example, if you use a 1-5 Likert scale
to measure clarity of values, the possible range is 4 points, so an increase from
3.0 to 4.0 would represent 25% of the possible range and would be substantial,
whereas an increase from 3.1 to 3.2 would represent only 2.5% of the possible range
and would be small. On a 50-item knowledge test, an increase from 33 to 34 would be
small, whereas an increase from 33 to 40 would be substantial.
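
Expressing a gain as a percentage of the possible range is a one-line calculation. The Python sketch below uses the 50-item knowledge test as its example; the function name and the assumption that scores on that test can run from 0 to 50 are introduced here for illustration.

```python
def gain_as_percent_of_range(before, after, lowest_possible, highest_possible):
    """Increase in the mean score as a percentage of the possible range."""
    possible_range = highest_possible - lowest_possible
    return 100 * (after - before) / possible_range


# 50-item knowledge test, assuming possible scores run from 0 to 50.
print(gain_as_percent_of_range(33, 34, 0, 50))   # a small gain
print(gain_as_percent_of_range(33, 40, 0, 50))   # a substantial gain
```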

When making assessments about the magnitude of the change, you should be
realistic. Innumerable evaluations of social programs have demonstrated that
changing people's social skills and behavior is very difficult, and in general, you
should be pleased with small changes. Even if you are not doing a rigorous
cost/effectiveness analysis of your program, you should at least consider the cost
and comprehensiveness of your program. If your program is short, you can be pleased
with smaller gains; if your program is more comprehensive, you should have higher
expectations.



Recommended Statistics Books 

Blalock, H.M., Jr. Social Statistics. Second Edition. New York: McGraw-Hill,
1972.

Bohrnstedt, G.W., & Knoke, D. Statistics for Social Data Analysis. Itasca, Ill.:
F.E. Peacock, 1982.

Freeman, L.C. Elementary Applied Statistics. New York: John Wiley and Sons, 1965.

Loether, H.J., & McTavish, D.G. Inferential Statistics for Sociologists: An
Introduction. Boston: Allyn and Bacon, 1974.

Nie, N., Hull, C.H., Jenkins, J., Steinbrenner, K., & Bent, D. Statistical Package
for the Social Sciences. New York: McGraw-Hill, 1975.

Siegel, S. Nonparametric Statistics for the Behavioral Sciences. New York:
McGraw-Hill, 1956.






CHAPTER 15



The quality of writing of your report may substantially affect the extent to
which it is read and used. If you carefully describe your program, your research
steps, your results, and your conclusions, your report may affect subsequent
programs and policy. If you fail to present needed evidence or to focus your
conclusions, your entire evaluation may have far less impact than it deserves.

Planning the Writing Project 

Just as there are important steps in designing questionnaires, so there are
important steps in writing a report.

• Define your audience and consider its needs and interests.

• Determine which points or conclusions you wish to emphasize, and create a
detailed outline of the report.

• Write a draft, let it sit for several days or so, and then rewrite it. 

• Have others involved in your research and in your field review it, and 
incorporate their suggestions. 

• If feasible, have a professional editor help you organize and edit it. 

When you consider the needs of your audience, remember that some important
members of your audience will view research methodology and statistical analysis as
foreign languages. Thus, you must keep in mind not only their needs for the results
(to justify funding, improve the program, etc.), but also their familiarity with
methods and statistics. Although you may be writing the report after having been
immersed in the fine distinctions of statistical analysis, many of your readers
will be put off by them.

In some situations, you may want to summarize in lay terms the important
findings and recommendations at the beginning of each section and then move into
more technical explanations. Those who do not need or want the more technical
material can skip over it, and those who are interested can skip to the tables,
graphs, and more technical discussions of findings.

In other situations, you may find it useful to write two reports: one for
general dissemination such as in your community, and another for the professional
literature. You do not have to satisfy all requirements of your varied audience
with one piece of writing.

As you write, try to follow these general guidelines:

• Keep both sentences and paragraphs short.

• Use descriptive section headings.

• Use active, not passive, verbs. (Not "questionnaires were completed by
students"; rather, "students completed questionnaires.")

• Use definite, specific, concrete language.

• Continuously edit to omit needless words, tighten loose sentences, and
rewrite jargon.

Contents of the Report

Your report should explain and discuss the following:

• The background and purpose of the evaluation: who wanted it for what
reasons

• The nature of the program: demographic characteristics of participants,
goals, content, time period, etc.

• The specific goals or objectives that are being evaluated

• The general methods used in the evaluation

• The questionnaires used

• The sample: selection criteria, size, and other characteristics

• The results both in table form and in prose

• The limitations of the evaluation

• The conclusions and recommendations.



Presenting Quantitative Results

You should emphasize your most important points by creating one or more tables
and graphs. Tables are eye-catching; graphs are even more powerful. Some people
will read only the tables and graphs and ignore much of the text. Thus, tables and
graphs should include your major findings and should be self-explanatory.

Following are some guidelines for creating these tables:

• Create a title that accurately describes the content of the table.

• Provide pretest and posttest means to show the size of the increase or
change.

• Include the sample size.

• Include tests of significance if they were calculated. Follow convention
by letting * = a result significant at the .05 level; ** = a result
significant at the .01 level; and *** = a result significant at the .001
level.

• Rather than using numbers to signify footnotes, use letters (e.g., a, b,
c).

For example, the data from the previous chapter has been presented in Table
15-1.

If you have more than one group, then you need to add additional rows for those
groups. For example, Table 15-2 is based upon different fictitious data with
additional groups.

Bar Graphs

Since bar graphs have a greater visual impact than tables, you may find it
useful to create bar graphs of the most important outcomes. Grouped frequency
distributions and the means and medians of different groups lend themselves well to
this treatment.







Table 15-1

Mean Pretest and Posttest Scores on a 40-Item Multiple Choice Test



                                                  Difference    T-test for
                       Sample                      Between      Difference
Group                   Size   Pretest  Posttest    Means      Between Means

Sexuality Education      25     24.84    29.72       4.88         2.27**
Class



**Significant at the .01 level.

a. Matched pairs.



Table 15-2

Mean Pretest and Posttest Scores on a 40-Item Multiple Choice Test
for a Sexuality Education Class and Its Control Group



                                                  Difference    T-test for
                       Sample                      Between      Difference
Group                   Size   Pretest  Posttest    Means      Between Means

Sexuality Education      43     29.45    33.93       4.48
Class
                                                                  3.18**
Control Group            42     28.22    31.21       2.99



**Significant at the .01 level.







Following are several suggestions for creating bar graphs:

• Use a separate bar (or pair of bars) for each major objective.

• Have bars which should be compared next to one another. For example, place
the bar representing a pretest next to the bar representing the posttest.

• Label each axis.

• Make intervals along the axis equal.

• If possible, use a zero point at the base of the axis; if not, put a jagged
line through both the axis and the bars near the base.

• Above each bar, indicate the actual number the bar represents (e.g.,
frequencies or means).

• Represent the criterion or goal of an objective by drawing a line that is
perpendicular to the bars (see Figure 15-2).

The frequency distributions and the means for the data from the previous
chapter can be represented by bar graphs as in Figure 15-1 and Figure 15-2
respectively.
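
If you do not have graphing software, even a plain-text bar graph can follow most of these suggestions. The Python sketch below is one possible illustration, with a function name assumed for the example. It draws horizontal rather than vertical bars, places each pretest bar next to its posttest bar, starts every bar at zero, and prints the actual frequency at the end of each bar. The frequencies are the grouped pretest and posttest frequencies from the previous chapter.

```python
classes = ["6-10", "11-15", "16-20", "21-25", "26-30", "31-35", "36-40"]
pretest = [1, 2, 4, 5, 9, 2, 2]
posttest = [0, 2, 2, 3, 4, 8, 6]


def bar_rows(labels, pre, post):
    """Build text rows with a pretest bar next to each posttest bar."""
    rows = []
    for label, pre_count, post_count in zip(labels, pre, post):
        # One '#' per student; the number after the bar is its true value.
        rows.append(f"{label:>6}  Pretest  {'#' * pre_count} {pre_count}")
        rows.append(f"{'':>6}  Posttest {'#' * post_count} {post_count}")
    return rows


print(" Score   (each # = 1 student)")
for row in bar_rows(classes, pretest, posttest):
    print(row)
```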



Presenting Nonquantitative Data

In the analysis of sexuality education programs, the most common form of
nonquantitative data is written statements. These may be unsolicited or the result
of open-ended questions (What did you learn in this program? How did this program
affect you? What changes in the program would you recommend?). Too often such
statements are neither fully analyzed nor reported, despite their potential value.






Figure 15-1

Number of People Receiving Different Scores on Pretests and Posttests

[Bar graph. Vertical axis: Number of Students (0 to 10). Horizontal axis: Score
on Test, in classes 6-10, 11-15, 16-20, 21-25, 26-30, 31-35, and 36-40. Key:
paired bars for Pretest and Posttest.]

Figure 15-2

The Mean Knowledge Test Scores for Students on the Pretest and Posttest

[Bar graph. Vertical axis: Mean Score (0 to 30). Bars: Pretest (N = 25), 24.8;
Posttest (N = 25), 29.7. A line across the bars marks the criterion of 28.0.]






When reporting such statements, it is extremely important to fairly represent
negative as well as positive statements. To report only positive statements and to
exclude negative statements is simply misleading and invalid. Just as it is
unethical to throw out low scores from a posttest knowledge test, so it is also
unethical to fail to proportionately report negative statements. Moreover, critical
statements can help others recognize deficiencies in programs and then improve them.

An assortment of spontaneous or solicited statements can appear to defy
reasonable organization. If so, begin by writing each statement on a separate sheet
of paper (e.g., 3 x 5 cards); then organize the statements by content. You may then
do one or more of the following:

• Select a few representative statements to illustrate points made in the
quantitative analysis or in the text.

• Include all the statements in the report.

• Accurately summarize the statements by presenting the frequency with which
different themes or ideas are mentioned.
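
Once each statement has been assigned to a theme, counting how often each theme appears is straightforward. The Python sketch below shows one way to do it. The statements, the theme labels, and the function name are all hypothetical and are included only to illustrate the tally; in practice you would assign the themes yourself while sorting the cards.

```python
from collections import Counter

# Hypothetical statements, each already coded with one theme.
coded_statements = [
    ("I can talk to my parents more easily now.", "communication"),
    ("The films were out of date.", "materials"),
    ("I learned a lot about birth control.", "knowledge"),
    ("We needed more time for discussion.", "scheduling"),
    ("I feel more comfortable asking questions.", "communication"),
    ("Some handouts were hard to read.", "materials"),
    ("I talk with my friends about this more openly.", "communication"),
]


def theme_frequencies(coded):
    """Return (theme, count, percent of all statements), most common first."""
    counts = Counter(theme for _, theme in coded)
    total = len(coded)
    return [(theme, n, 100 * n / total) for theme, n in counts.most_common()]


for theme, n, pct in theme_frequencies(coded_statements):
    print(f"{theme:<15} {n:>3} {pct:>6.1f}%")
```

Because every statement is counted, negative and positive themes appear in proportion to how often they were actually mentioned.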



Dilemmas in Writing and Publishing the Results

Evaluators often do not obtain the results they hoped for. They are then faced
with the dilemma of what to report. Should they report only the positive findings,
or should they report all findings accurately? Especially when the findings have
political significance, there can be considerable pressure to emphasize only the
positive findings, and people will often offer superficially convincing
justifications for doing so. However, in the long run, burying negative findings is
not helpful to either sexuality education or evaluation research. You certainly
have an ethical obligation to report findings accurately. Moreover, if you find
negative results, you should report them so that people can try to improve programs
and so that other findings can be trusted.

This responsibility does not necessarily mean that you should not give any
thought to the possible political use of your findings. People both for and against
sexuality education may selectively quote your results, and as much as possible, you
should not write statements that will be greatly misleading if quoted out of
context. In other words, each statement should be balanced as much as possible.

The same considerations arise when you consider publishing your results. If
those people who find that sexuality education improves behavior publish their
results, and if those people who find that sexuality education has no impact or a
negative impact fail to publish their findings, then the literature will obviously
become biased and people will incorrectly believe that sexuality education is more
effective than it actually is.

On the other hand, if you failed to obtain positive findings and have good
reason to believe that your findings are invalid, then you should either reevaluate
your program before publishing the results, or you should emphasize in your report
that your results may be invalid.






Suggested Readings

Bernstein, T.M. The Careful Writer: A Modern Guide to English Usage. New York:
Atheneum, 1977.

A Manual of Style. 12th ed., rev. Chicago: University of Chicago Press, 1969.

Morris, L.L., & Fitz-Gibbon, C.T. How to Present an Evaluation Report. Beverly
Hills, Calif.: Sage Publications, 1978.

Publication Manual of the American Psychological Association. 2d ed. Washington,
D.C.: American Psychological Association, 1974.

Strunk, W., Jr., & White, E.B. Elements of Style. 2d ed. New York: Macmillan,
1972.






CHAPTER 16
EVALUATING SPECIFIC KINDS OF PROGRAMS



Although the methods described in this volume are generally applicable to many
kinds of sexuality education programs, some programs have special characteristics
that affect their evaluation. This chapter discusses some of the special
considerations for evaluating specific kinds of programs.

Comprehensive Programs Lasting About a Semester

The methods already described in this volume are well suited to evaluating
comprehensive programs. Because most comprehensive programs last numerous weeks,
using an experimental design with a control group is especially important; during
the elapsed time, students' knowledge, attitudes, and behavior may change even if
they do not take the program. Comprehensive programs are also more likely than
shorter programs to have a variety of effects upon knowledge, attitudes, and
behavior. Thus, it is especially important that you carefully specify objectives
and measure most or all of the possible outcomes.

At the end of a semester course, students may trust the teacher much more and
may be much more open about their sexuality than at the beginning of the semester.
Thus, they may answer questions about attitudes and behavior more honestly at the
end of the semester than at the beginning. To reduce this possible bias, you can
use the first week of the course to increase trust and openness, but not teach much
about sexuality, and then administer the pretests during the second week.

Short Structured Courses Lasting 1 or 2 Weeks

Programs that last a relatively short time are likely to have less impact than
programs lasting a semester or longer. Thus, there are fewer plausible outcomes
that you should measure and the questionnaires should be shorter. For example,
short courses are less likely to have an impact upon self-esteem and there is less
need to measure self-esteem.

Programs that last 1 or 2 weeks are unlikely to produce much behavioral change
during the course but may produce considerable behavioral change after the course.
Thus, you should be sure to administer second posttests weeks or months after the
end of the course.



One-day Conferences

Because 1-day conferences are so short, they require a modified experimental
design. Many desired behavioral outcomes of the conference cannot take place during
that day and need not be measured at the end of the conference. Obviously, for
example, there is no need to measure the amount of unprotected sexual activity twice
on the day of the conference. Thus, posttests measuring behavior should be
administered weeks or months later. On the other hand, both knowledge and attitudes
may change during the day and can profitably be measured at both the beginning and
the end of the conference.

The short duration of the conference has a second impact upon the methodology.
At the beginning of the day, participants are usually fresh and relatively willing
to complete questionnaires carefully. By the end of the day, they are likely to be
tired, to have less energy and concentration, and to be less willing to complete
lengthy questionnaires carefully. Thus, if you do administer questionnaires at the
end of the day, you should make the questionnaires as short and easy as possible.

Because the short posttests administered at the end of the conference are not
sufficient to measure change in many important outcomes, administering
questionnaires at a later time becomes especially important. Doing this usually
requires obtaining the participants' names and addresses and making some
arrangements for mailing them the questionnaires. Especially when the
questionnaires contain sensitive questions about sexuality, follow these procedures:

• Carefully explain to the participants the importance of their completing
the second posttests at home.

• Obtain their permission to send the questionnaires to their homes.

• Strongly encourage them to complete the questionnaires anonymously without
anyone else's help or advice.

• Encourage participants to return the questionnaires, even if they decide
not to complete them, so that questionnaires with sensitive questions do
not circulate throughout the community.

Many conferences are voluntary, and some participants may leave the conference
before it ends. Participants' leaving before completing the posttest questionnaires
may affect your results negatively. First, it will reduce your sample size.
Second, the participants who leave may be different in some way (in intelligence,
motivation, or satisfaction with the conference) from those who stay. Thus, their
loss may bias your sample and affect its representativeness. This possible bias
might render a comparison of the mean scores on the pretest and posttest invalid.
To prevent such a bias, use identification numbers and match pretests with
posttests.



Similar biases may occur when you obtain delayed posttest data: less motivated
people may refrain from returning questionnaires. Once again, you can reduce this
problem by using identification numbers and matching pretests with the delayed
posttests.
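
The matching step described above can be sketched in a few lines of Python. This is only an illustration, not a procedure taken from the evaluations reported in this volume: the function name, the identification numbers, and the scores are all hypothetical. The sketch keeps only respondents who completed both questionnaires, so that the mean change is not distorted by people who left early or did not return the delayed posttest.

```python
from statistics import mean

def matched_change(pretests, posttests):
    """Compare pretest and posttest scores for matched respondents only.

    Both arguments map an identification number to a total score.
    Returns (mean change, number of matched pairs, number of dropouts).
    """
    # Keep only identification numbers that appear on both questionnaires.
    matched_ids = sorted(set(pretests) & set(posttests))
    changes = [posttests[i] - pretests[i] for i in matched_ids]
    dropouts = len(pretests) - len(matched_ids)
    average = mean(changes) if changes else None
    return average, len(matched_ids), dropouts

# Hypothetical scores keyed by identification number (not real data).
pre = {101: 12, 102: 15, 103: 9, 104: 20}
post = {101: 16, 102: 15, 104: 23}   # respondent 103 left before the posttest

avg_change, n_matched, n_dropped = matched_change(pre, post)
print(avg_change, n_matched, n_dropped)
```

Reporting the number of dropouts alongside the mean change lets readers judge how much the loss of participants might still limit the representativeness of the matched sample.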

Because 1-day conferences are short, they probably will have less impact than
longer, more comprehensive programs. Thus, you need not measure many of the
outcomes that you might wish to measure in longer programs. For example, you
probably need not measure changes in self esteem or skills, because conferences are
not likely to have an impact upon these outcomes.

Some conferences have flexible, unstructured formats. That is, participants
can attend different activities, peruse materials on their own, ask counselors or
other professionals questions during small group discussions, etc. Thus,
participants may be less likely to learn a specific set of facts, but more likely to
learn particular factual information that is immediately relevant to them. Thus,
traditional knowledge tests may be a poor method of assessing knowledge gain.
Unfortunately, there are not many good alternatives. You can include open-ended
questions that ask participants 1) to summarize what they have learned or 2) to
specify several factual pieces of information that they learned, but such questions
usually fail to accurately assess how much the respondents learned, how much they
actually know, and what topics should be covered more fully.

Peer Education Programs

Peer education programs are particularly difficult to evaluate because their
effects are likely to be small and diffuse. That is, peer educators are likely to
interact with only a small number of students who are scattered throughout the
student body. Nevertheless, there are at least two potentially successful
strategies for evaluation.

First, you can collect data on the entire school body during the years both
before and after the peer education program is implemented. This data can be
questionnaire data collected from the students, pregnancy data collected from
clinics and doctors, or other kinds of data. A comparison of the before and after
data will give some evidence for the effects of the program.

Second, if the peer educators speak before selected classes of students in the
school, and if they subsequently meet primarily with students in these classes, then
you can randomly assign classes to experimental and control groups. The classes to
which the peer educators speak would comprise the experimental group; the classes to
which they don't speak, the control group. You can then administer questionnaires
to both groups of classes at the beginning and end of the year and compare the
changes over time.
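
A minimal sketch of random assignment of classes follows. It assumes that you have a simple list of class labels; the labels, the function name, and the seed value are hypothetical and serve only to illustrate the idea. Fixing the seed makes the assignment reproducible, so that you can document exactly how the groups were formed.

```python
import random

def assign_classes(class_ids, seed=None):
    """Randomly split classes into experimental and control groups."""
    rng = random.Random(seed)      # a fixed seed makes the split repeatable
    shuffled = list(class_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "experimental": sorted(shuffled[:half]),   # peer educators speak here
        "control": sorted(shuffled[half:]),        # no peer educator visit
    }

# Hypothetical class labels.
groups = assign_classes(["9A", "9B", "9C", "10A", "10B", "10C"], seed=1984)
print(groups["experimental"])
print(groups["control"])
```

With an odd number of classes, this sketch places the extra class in the control group; you may prefer a different rule.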

Questionnaires should include questions on the number of contacts that the
students had with the peer educators. If some students had no contact with the peer
educators, then changes in their knowledge or attitudes could not have been produced
by interaction with the peer educators. You should, however, use this line of
reasoning with caution. Do not compare students who seek information from the peer
educators with students who do not, because students who meet with the peer
educators may be more likely to be sexually active and in need of information than
those who do not seek information.

Your questionnaires should be quite short because the small amount of
interaction with the peer educators is not likely to produce a substantial amount of
change and you do not need to measure as many outcomes.

In general, all analyses of peer education programs should be viewed with
caution:

• The effects of interaction with peer educators are small and diffuse.

• Assessing the amount of interaction between each student and the peer
educators is difficult.

• Students who seek advice from peer educators are likely to be different
from those who don't.






Parent/Child Programs



Programs offered to parents and their children together can be evaluated in
much the same way that other programs have been evaluated. However, such programs
offer an additional possibility of matching the parents with their children, asking
similar questions of each, and then comparing answers. For example, to measure the
impact of the program on family communication about sexuality, you can ask both
parents and their children how often they talk about sexuality and how comfortable
they are during those conversations. (To date, however, researchers have found very
little relationship between the reports of the parents and the reports of their
children.)
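
One way to compare matched parent and child answers is to compute a correlation between the two sets of reports. The sketch below assumes that each family contributes one parent rating and one child rating on the same question (for example, how often they talk about sexuality, on a 1 to 5 scale). The ratings shown are hypothetical, and the choice of a Pearson correlation is an assumption made for illustration, not a method prescribed in this volume.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of ratings."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return None                # undefined when one side never varies
    return covariation / (spread_x * spread_y)

# Hypothetical (parent rating, child rating) pairs, one pair per family.
pairs = [(5, 2), (3, 3), (4, 1), (1, 2), (2, 4)]
parent_reports = [parent for parent, _ in pairs]
child_reports = [child for _, child in pairs]

r = pearson_r(parent_reports, child_reports)
print(r)
```

A value near zero would mean that the parents' reports tell you little about the children's reports, which is the pattern described in the paragraph above.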

A few researchers have tried to measure the impact of parent courses upon the
quality of family communication by video or audio taping family conversation about
sexuality both before and after the program. Although the method is intriguing, we
don't recommend it.

• The presence of the tape recorder and the instructions for the session
prevent people from talking in a normal manner. People find taping far
more intimidating than completing questionnaires.

• For some families, the instructions to talk about sexuality encourages or
forces them to do something they have never done before and may therefore
give an inaccurate picture of family communication.

• Many families do not even talk about sexuality during their preprogram tape
recordings, rendering those tapes invalid.

• Implementing the recording sessions is time consuming to both the
participants and the researchers because the recordings cannot be completed
simultaneously in a single group.

• Coding tape recordings is very difficult.



Conclusions 

We have been evaluating sexuality education programs for several years and have
learned a great deal from our experiences. At the beginning we certainly made our
share of mistakes: we designed questionnaires that were too difficult and too long;
we included a few questions that were too sensitive for some people; we tried to
measure too many outcomes; we sometimes failed to obtain data from control groups;
we sometimes failed to ensure the proper administration of questionnaires; we tried
a few new, innovative, and totally unworkable approaches. However, we learned,
continually improved our methods, and collected very useful data on the effects of
programs.

We have written this volume so that you can learn from some of our experiences
and avoid some of our mistakes. No single volume can present all that you need to
know to conduct valid evaluations. However, if you follow the principles and
methods described in this volume you are likely to obtain valid and very useful
information about the success of your program. You can then improve or expand your
program.






Remember the following points:

• Specify clearly the most important outcomes of the program that you wish to
measure. Recognize that you probably cannot measure validly all important
outcomes; be realistic about what you can measure. Be careful about asking
sensitive questions.

• Obtain approval from the school or organizational authorities, parents,
participants, and other appropriate groups. Be able to justify why you're
asking each question.

• Use multiple methods as much as possible: design different kinds of
questionnaires, give them to different groups of people, and use other
methods as well.

• Use as many characteristics of experimental designs as possible. At a
minimum collect pretest and posttest data.

• Repeatedly pretest your questionnaires and be sure they are reliable and
valid for your particular respondents.

• Make sure the administration of the questionnaires is rigorous; if students
fail to treat the questionnaires seriously, the most carefully designed
evaluation can be useless.

• Be especially careful to maintain the anonymity or confidentiality of any
potentially sensitive data.

• Be prepared to learn that your evaluation indicates your program is not as
effective as you had hoped.

• At every stage guard against letting your hopes and values bias your
evaluation.

Increasing numbers of people are evaluating their programs with these methods,
improving their programs, and finding that the combination of evaluation and program
improvement is well worth the effort.

If you have conducted few or no evaluations before, you may feel intimidated by
all the methods described in this volume. If so, remember some of the suggestions
given for making the process easier:



• Start with a small and relatively simple evaluation, and as you become more
familiar with evaluation methods, improve the size and quality of your
evaluation.

• Contact methodological consultants or other members in the field who have
previously completed evaluations; ask them questions, and learn from their
experiences.

• At first use questionnaires and other materials developed by others; later
develop your own questionnaires.



Remember, both evaluation methods and statistics may appear difficult, but they
are a logical set of procedures to collect data systematically and make that data
more clear and concise so that important questions can be better answered. In many
respects learning to use these methods is like learning to ride a bicycle: the going
is difficult at first, but becomes much easier with practice. Just as falling down
makes the principles of bike riding more obvious, so will making mistakes with these
methods make their rationale more obvious.






APPENDIX 
QUESTIONNAIRES



These questionnaires are modified versions of the questionnaires that we have
used to evaluate sexuality education programs for adolescents. They include
evaluations and assessments of the course to administer at the end of the course and
questionnaires which measure knowledge, attitude, and behavior to administer before
and after the course. See the guidelines below.



Questionnaire                                     Administration

Knowledge Questionnaire                           Before and after the course
Attitude and Value Inventory                      Before and after the course
Behavior Inventory                                Before and after the course
Knowledge, Attitude, and Behavior Inventory       Before and after the course
  (An integrated, condensed version of the
  first three)
Course Evaluation                                 After the course
Assessment of Course Impact                       After the course
Course Assessment for Parents                     After the course



These questionnaires provide examples of questions that you can use. You
should modify the questionnaires to meet the values of your community, the
particular goals of your program, and the characteristics of your program
participants. For example, if the adolescents in your program are not likely to be
sexually active, then you should remove those questions dealing with sexual
activity. You should also consider the appropriate length of each questionnaire.
If the questionnaire is too long, remove questions or use the Knowledge, Attitude,
and Behavior Inventory, which contains the most important questions from the
Knowledge Questionnaire, the Attitude and Value Inventory, and the Behavior
Inventory.



The Knowledge Questionnaire, Attitude and Value Inventory, and Behavior
Inventory can be subdivided into the following individual scales. The questions on
the Knowledge Questionnaire and the Behavior Inventory can be analyzed separately or
as scales. Although the scales include questions measuring the same topics, they
are not true multi-item scales. In contrast, the scales in the Attitude and Value
Inventory are true multi-item scales. Thus, if you intend to measure a particular
attitude, you should include all the questions of that scale. That is, you should
not use the questions individually.

For reasons of space, we have listed here only the item numbers for each scale.
Because the Attitude and Value Inventory contains true multi-item scales, you may
prefer to read the items grouped as scales rather than randomly ordered through the
inventory. For your convenience in doing so, we have grouped all the items by scale
at the end of that questionnaire.



Scales in the Knowledge Questionnaire   Question Numbers

Physical Development                    2, 8, 13, 15, 25, 28
Adolescent Relationships                22, 27, 29
Adolescent Sexual Activity              1, 3, 16, 17
Adolescent Pregnancy                    6, 20, 23
Adolescent Marriage                     9, 30
Probability of Pregnancy                5, 10, 12, 19
Birth Control                           4, 11, 18, 26, 31, 32, 34
Sexually Transmitted Disease            7, 14, 21, 24, 33



Scales in the Attitude and Value Inventory        Question Numbers

Clarity of Long Term Goals                        10, 23, 30, 37, 51
Clarity of Personal Sexual Values                 5, 13, 25, 49, 70
Understanding of Emotional Needs                  14, 17, 48, 56, 62
Understanding of Personal Social Behavior         6, 19, 27, 34, 66
Understanding of Personal Sexual Response         21, 31, 36, 45, 52
Attitude Toward Various Gender Role Behaviors     8, 28, 41, 50, 65
Attitude Toward Sexuality in Life                 12, 42, 55, 58, 64
Attitude Toward the Importance of Birth Control   4, 16, 40, 59, 61
Attitude Toward Premarital Intercourse            2, 20, 22, 29, 63
Attitude Toward the Use of Pressure and Force     15, 46, 47, 54
  in Sexual Activity
Recognition of the Importance of the Family       11, 24, 53, 60, 69
Self Esteem                                       3, 26, 35, 44, 68
Satisfaction with Personal Sexuality              7, 18, 33, 39, 57
Satisfaction with Social Relationships            1, 32, 38, 43, 67
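
Because the attitude scales are meant to be used whole, a scale score can be computed only when a respondent has answered every item in the scale. The sketch below illustrates this for the Clarity of Long Term Goals scale (items 10, 23, 30, 37, and 51). Two things in it are assumptions rather than instructions from this volume: averaging the 1 to 5 ratings, and reverse-scoring the negatively worded items (taken here to be items 10 and 51) so that a high score always means greater clarity. The respondent's answers are hypothetical.

```python
def scale_score(responses, items, reversed_items=frozenset()):
    """Average the 1-5 ratings for one scale.

    responses maps item number to the circled rating. Returns None if any
    item of the scale is unanswered, since the items are not used singly.
    """
    values = []
    for item in items:
        rating = responses.get(item)
        if rating is None:
            return None
        # Assumption: negatively worded items are flipped (1<->5, 2<->4).
        values.append(6 - rating if item in reversed_items else rating)
    return sum(values) / len(values)

LONG_TERM_GOALS = (10, 23, 30, 37, 51)   # item numbers from the list above
ASSUMED_REVERSED = {10, 51}              # assumption: negatively worded items

answers = {10: 2, 23: 4, 30: 5, 37: 4, 51: 1}   # hypothetical respondent
print(scale_score(answers, LONG_TERM_GOALS, ASSUMED_REVERSED))   # 4.4
```

If you modify the inventory, check the wording of each item yourself before deciding which items to reverse.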



Scales in the Behavior Inventory                  Question Numbers

Taking Responsibility for Behavior                1, 2
Decisionmaking Skills                             3, 4, 5, 6
Decisionmaking Skills about Sexual Behavior       7, 8, 9, 10, 11
Communication Skills                              12, 13, 14, 15, 16, 17, 18, 19
Assertiveness Skills about Sexual Behavior        20, 21, 22, 23, 24
Comfort with Social Interaction                   25, 26, 27, 28
Comfort Talking about Sex and Birth Control       29, 30, 31, 32, 33, 34, 36
Comfort Talking about Sexuality with Parents      31, 34
Comfort Talking about Sexuality with Friends      29, 32
Comfort Talking about Sexuality with Girl or      30, 33
  Boyfriend
Comfort Expressing Concern and Caring             35
Comfort Being Assertive Sexually                  36, 37
Comfort with Current Sex Life                     38
Comfort Getting and Using Birth Control           39, 41, 42
Sexual Activity                                   43, 44, 45
Use of Birth Control                              46, 47, 48
Frequency of Communication about Sex and          49, 52
  Birth Control with Parents
Frequency of Communication about Sex and          50, 53
  Birth Control with Friends
Frequency of Communication about Sex and          51, 54
  Birth Control with Boyfriend or Girlfriend






We are trying to find out if this program is successful. You can help us by
completing this questionnaire.

To keep your answers confidential and private, do NOT put your name anywhere on this
questionnaire. Please use a regular pen or pencil so that all questionnaires will
look about the same and no one will know which is yours.

Because this study is important, your answers are also important. Please answer
each question carefully.

Thank you for your help.

Name of school or organization
where course was taken:

Teacher's name:

Your birth date: Month _______ Day _______

Your sex (Check one):   Male   Female

Your grade level in school (Check one):    9
                                          10
                                          11
                                          12




Circle the one best answer to each of the questions below.

1. By the time teenagers graduate from high schools in the United States:

    a. only a few have had sex (sexual intercourse).
    b. about half have had sex.
    c. about 80% have had sex.

2. During their menstrual periods, girls:

    a. are too weak to participate in sports or exercise.
    b. have a normal, monthly release of blood from the uterus.
    c. cannot possibly become pregnant.
    d. should not shower or bathe.
    e. all of the above.

3. It is harmful for a woman to have sex (sexual intercourse) when she:

    a. is pregnant.
    b. is menstruating.
    c. has a cold.
    d. has a sexual partner with syphilis.
    e. none of the above.

4. Some contraceptives:

    a. can be obtained only with a doctor's prescription.
    b. are available at family planning clinics.
    c. can be bought over the counter at drug stores.
    d. can be obtained by people under 18 without their parents' permission.
    e. all of the above.

5. If 10 couples have sexual intercourse regularly without using any kind of
   birth control, the number of couples who become pregnant by the end of 1
   year is about:

    a. one.
    b. three.
    c. six.
    d. nine.
    e. none of the above.

6. When unmarried teenage girls learn they are pregnant, the largest group
   of them decide:

    a. to have an abortion.
    b. to put the child up for adoption.
    c. to raise the child at home.
    d. to marry and raise the child with the husband.
    e. none of the above.







7. People having sexual intercourse can best prevent getting a sexually
   transmitted disease (VD or STD) by using:

    a. condoms (rubbers).
    b. contraceptive foam.
    c. the pill.
    d. withdrawal (pulling out).

8. When boys go through puberty:

    a. they lose their "baby fat" and become slimmer.
    b. their penises become larger.
    c. they produce sperm.
    d. their voices become lower.
    e. all of the above.

9. Married teenagers:

    a. have the same social lives as their unmarried friends.
    b. avoid pressure from friends and family.
    c. still fit in easily with their old friends.
    d. usually support themselves without help from their parents.
    e. none of the above.

10. If a couple has sexual intercourse and uses no birth control, the woman
    might get pregnant:

    a. any time during the month.
    b. only 1 week before menstruation begins.
    c. only during menstruation.
    d. only 1 week after menstruation begins.
    e. only 2 weeks after menstruation begins.

11. The method of birth control which is least effective is:

    a. a condom with foam.
    b. the diaphragm with spermicidal jelly.
    c. withdrawal (pulling out).
    d. the pill.
    e. abstinence (not having intercourse).

12. It is possible for a woman to become pregnant:

    a. the first time she has sex (sexual intercourse).
    b. if she has sexual intercourse during her menstrual period.
    c. if she has sexual intercourse standing up.
    d. if sperm get near the opening of the vagina, even though the man's
       penis does not enter her body.
    e. all of the above.

13. Physically:

    a. girls usually mature earlier than boys.
    b. most boys mature earlier than most girls.
    c. all boys and girls are fully mature by age 16.
    d. all boys and girls are fully mature by age 18.






14. It is impossible now to cure:

    a. syphilis.
    b. gonorrhea.
    c. herpes virus.
    d. vaginitis.
    e. all of the above.

15. When men and women are physically mature:

    a. each female ovary releases two eggs each month.
    b. each female ovary releases millions of eggs each month.
    c. male testes produce one sperm for each ejaculation (climax).
    d. male testes produce millions of sperm for each ejaculation (climax).
    e. none of the above.

16. Teenagers who choose to have sexual intercourse may possibly:

    a. have to deal with a pregnancy.
    b. feel guilty.
    c. become more close to their sexual partners.
    d. become less close to their sexual partners.
    e. all of the above.

17. As they enter puberty, teenagers become more interested in sexual
    activities because:

    a. their sex hormones are changing.
    b. the media (TV, movies, magazines, records) push sex for teenagers.
    c. some of their friends have sex and expect them to have sex also.
    d. all of the above.

18. To use a condom the correct way, a person must:

    a. leave some space at the tip for the guy's fluid.
    b. use a new one every time sexual intercourse occurs.
    c. hold it on the penis while pulling out of the vagina.
    d. all of the above.

19. The proportion of American girls who become pregnant before turning
    20 is:

    a. 1 out of 3.
    b. 1 out of 11.
    c. 1 out of 43.
    d. 1 out of 90.

20. In general, children born to young teenage parents:

    a. have few problems because their parents are emotionally mature.
    b. have a greater chance of being abused by their parents.
    c. have normal birth weight.
    d. have a greater chance of being healthy.
    e. none of the above.






21. Treatment for venereal disease is best if:

    a. both partners are treated at the same time.
    b. only the partner with the symptoms sees a doctor.
    c. the person takes the medicine only until the symptoms disappear.
    d. the partners continue having sex (sexual intercourse).
    e. all of the above.

22. Most teenagers:

    a. have crushes or infatuations that last a short time.
    b. feel shy or awkward when first dating.
    c. feel jealous sometimes.
    d. worry a lot about their looks.
    e. all of the above.

23. Most unmarried girls who have children while still in high school:

    a. depend upon their parents for support.
    b. finish high school and graduate with their class.
    c. never have to be on public welfare.
    d. have the same social lives as their peers.
    e. all of the above.

24. Syphilis:

    a. is one of the most dangerous of the venereal diseases.
    b. is known to cause blindness, insanity, and death if untreated.
    c. is first detected as a chancre sore on the genitals.
    d. all of the above.

25. For a boy, nocturnal emissions (wet dreams) means he:

    a. has a sexual illness.
    b. is fully mature physically.
    c. is experiencing a normal part of growing up.
    d. is different from most other boys.

26. If people have sexual intercourse, the advantage of using condoms is that
    they:

    a. help prevent getting or giving VD.
    b. can be bought in drug stores by either sex.
    c. do not have dangerous side effects.
    d. do not require a prescription.
    e. all of the above.

27. If two people want to have a close relationship, it is important
    that they:

    a. trust each other and are honest and open with each other.
    b. date other people.
    c. always think of the other person first.
    d. always think of their own needs first.
    e. all of the above.






28. The physical changes of puberty:

    a. happen in a week or two.
    b. happen to different teenagers at different ages.
    c. happen quickly for girls and slowly for boys.
    d. happen quickly for boys and slowly for girls.

29. For most teenagers, their emotions (feelings):

    a. are pretty stable.
    b. seem to change frequently.
    c. don't concern them very much.
    d. are easy to put into words.
    e. are ruled by their thinking.

30. Teenagers who marry, compared to those who do not:

    a. are equally likely to finish high school.
    b. are equally likely to have children.
    c. are equally likely to get divorced.
    d. are equally likely to have successful work careers.
    e. none of the above.

31. The rhythm method (natural family planning):

    a. means couples cannot have intercourse during certain days of the
       woman's menstrual cycle.
    b. requires the woman to keep a record of when she has her period.
    c. is effective less than 80% of the time.
    d. is recommended by the Catholic church.
    e. all of the above.

32. The pill:

    a. can be used by any woman.
    b. is a good birth control method for women who smoke.
    c. usually makes menstrual cramping worse.
    d. must be taken for 21 or 28 days in order to be effective.
    e. all of the above.

33. Gonorrhea:

    a. is 10 times more common than syphilis.
    b. is a disease that can be passed from mothers to their children during
       birth.
    c. makes many men and women sterile (unable to have babies).
    d. is often difficult to detect in women.
    e. all of the above.

34. People choosing a birth control method:

    a. should think only about the cost of the method.
    b. should choose whatever method their friends are using.
    c. should learn about all the methods before choosing the one that's
       best for them.
    d. should get the method that's easiest to get.
    e. all of the above.







Answers to the Knowledge Questionnaire

Question   Answer        Question   Answer

    1        b              18        d
    2        b              19        a
    3        d              20        b
    4        e              21        a
    5        d              22        [illegible]
    6        a              23        a
    7        a              24        d
    8        e              25        c
    9        [illegible]    26        a
   10        a              27        a
   11        e              28        b
   12        e              29        b
   13        [illegible]    30        e
   14        c              31        a
   15        d              32        d
   16        a              33        a
   17        d              34        c
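
Scoring the Knowledge Questionnaire against an answer key can be sketched as follows. The sketch is illustrative only: the function name and the student's answers are hypothetical, and the key shown is an excerpt covering the first five questions, so you would need to supply a complete key that you have checked yourself. An unanswered question is counted as incorrect.

```python
def percent_correct(answers, key):
    """Percentage of keyed questions that a respondent answered correctly."""
    correct = sum(1 for question, right in key.items()
                  if answers.get(question) == right)
    return 100.0 * correct / len(key)

# Excerpt of the answer key (questions 1-5 only), for illustration.
KEY_EXCERPT = {1: "b", 2: "b", 3: "d", 4: "e", 5: "d"}

# Hypothetical respondent who skipped question 5.
student = {1: "b", 2: "a", 3: "d", 4: "e"}

print(percent_correct(student, KEY_EXCERPT))   # 60.0
```

The same function can score a single scale if you pass it only the key entries for that scale's question numbers.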






We are trying to find out if this program is successful. You can help us by
completing this questionnaire.

To keep your answers confidential and private, do NOT put your name anywhere on this
questionnaire. Please use a regular pen or pencil so that all questionnaires will
look about the same and no one will know which is yours.

Because this study is important, your answers are also important. Please answer
each question carefully.

Thank you for your help.

Name of school or organization
where course was taken:

Teacher's name:

Your birth date: Month _______ Day _______

Your sex (Check one):   Male   Female

Your grade level in school (Check one):    9
                                          10
                                          11
                                          12






The questions below are not a test of how much you know. We are interested in what
you believe about some important issues. Please rate each statement according to
how much you agree or disagree with it. Everyone will have different answers. Your
answer is correct if it describes you very well.

Circle:  1 = if you Strongly Disagree with the statement.
         2 = if you Somewhat Disagree with the statement.
         3 = if you feel Neutral about the statement.
         4 = if you Somewhat Agree with the statement.
         5 = if you Strongly Agree with the statement.



 1. I am very happy with my friendships.                    1  2  3  4  5

 2. Unmarried people should not have sex (sexual
    intercourse).                                           1  2  3  4  5

 3. Overall, I am satisfied with myself.                    1  2  3  4  5

 4. Two people having sex should use some form of birth
    control if they aren't ready for a child.               1  2  3  4  5

 5. I'm confused about my personal sexual values and
    beliefs.                                                1  2  3  4  5

 6. I often find myself acting in ways I don't
    understand.                                             1  2  3  4  5

 7. I am not happy with my sex life.                        1  2  3  4  5

 8. Men should not hold jobs traditionally held by women.   1  2  3  4  5

 9. People should never take "no" for an answer when
    they want to have sex.                                  1  2  3  4  5

10. I don't know what I want out of life.                   1  2  3  4  5

11. Families do very little for their children.             1  2  3  4  5

12. Sexual relationships create more problems than
    they're worth.                                          1  2  3  4  5

13. I'm confused about what I should and should
    not do sexually.                                        1  2  3  4  5

14. I know what I want and need emotionally.                1  2  3  4  5

15. No one should pressure another person into sexual
    activity.                                               1  2  3  4  5








16. Birth control is not very important.                    1  2  3  4  5

17. I know what I need to be happy.                         1  2  3  4  5

18. I am not satisfied with my sexual behavior (sex life).  1  2  3  4  5

19. I usually understand the way I act.                     1  2  3  4  5

20. People should not have sex before marriage.             1  2  3  4  5

21. I do not know much about my own physical and
    emotional sexual response.                              1  2  3  4  5

22. It is all right for two people to have sex before
    marriage if they are in love.                           1  2  3  4  5

23. I have a good idea of where I'm headed in the
    future.                                                 1  2  3  4  5

24. Family relationships are not important.                 1  2  3  4  5

25. I have trouble knowing what my beliefs and values
    are about my personal sexual behavior.                  1  2  3  4  5

26. I feel I do not have much to be proud of.               1  2  3  4  5

27. I understand how I behave around others.                1  2  3  4  5

28. Women should behave differently from men most of the
    time.                                                   1  2  3  4  5

29. People should have sex only if they are married.        1  2  3  4  5

30. I know what I want out of life.                         1  2  3  4  5

31. I have a good understanding of my own sexual
    feelings and reactions.                                 1  2  3  4  5

32. I don't have enough friends.                            1  2  3  4  5

33. I'm happy with my sexual behavior now.                  1  2  3  4  5

34. I don't understand why I behave with my friends
    as I do.                                                1  2  3  4  5

35. At times I think I'm no good at all.                    1  2  3  4  5






36. I know how I react in different sexual situations.      1  2  3  4  5

37. I have a clear picture of what I'd like to be doing
    in the future.                                          1  2  3  4  5

38. My friendships are not as good as I would like them
    to be.                                                  1  2  3  4  5

39. Sexually, I feel like a failure.                        1  2  3  4  5

40. More people should be aware of the importance of
    birth control.                                          1  2  3  4  5

41. At work and at home, women should not have to behave
    differently from men, when they are equally capable.    1  2  3  4  5

42. Sexual relationships make life too difficult.           1  2  3  4  5

43. I wish my friendships were better.                      1  2  3  4  5

44. I feel that I have many good personal qualities.        1  2  3  4  5

45. I am confused about my reactions in sexual
    situations.                                             1  2  3  4  5

46. It is all right to pressure someone into sexual
    activity.                                               1  2  3  4  5

47. People should not pressure others to have sex with
    them.                                                   1  2  3  4  5

48. Most of the time my emotional feelings are clear
    to me.                                                  1  2  3  4  5

49. I have my own set of rules to guide my sexual
    behavior (sex life).                                    1  2  3  4  5

50. Women and men should be able to have the same jobs,
    when they are equally capable.                          1  2  3  4  5

51. I don't know what my long-range goals are.              1  2  3  4  5

52. When I'm in a sexual situation, I get confused about
    my feelings.                                            1  2  3  4  5

53. Families are very important.                            1  2  3  4  5



54. It is all right to demand sex from a girlfriend or boyfriend.    1 2 3 4 5

55. A sexual relationship is one of the best things a person can have.    1 2 3 4 5

56. Most of the time I have a clear understanding of my feelings and emotions.    1 2 3 4 5

57. I am very satisfied with my sexual activities just the way they are.    1 2 3 4 5

58. Sexual relationships only bring trouble to people.    1 2 3 4 5

59. Birth control is not as important as some people say.    1 2 3 4 5

60. Family relationships cause more trouble than they're worth.    1 2 3 4 5

61. If two people have sex and aren't ready to have a child, it is very important that they use birth control.    1 2 3 4 5

62. I'm confused about what I need emotionally.    1 2 3 4 5

63. It is all right for two people to have sex before marriage.    1 2 3 4 5

64. Sexual relationships provide an important and fulfilling part of life.    1 2 3 4 5

65. People should not be expected to behave in certain ways just because they are male or female.    1 2 3 4 5

66. Most of the time I know why I behave the way I do.    1 2 3 4 5

67. I feel good having as many friends as I have.    1 2 3 4 5

68. I wish I had more respect for myself.    1 2 3 4 5

69. Family relationships can be very valuable.    1 2 3 4 5

70. I know for sure what is right and wrong sexually for me.    1 2 3 4 5





Clarity of Long Term Goals

10. I don't know what I want out of life.
23. I have a good idea of where I'm headed in the future.
30. I know what I want out of life.
37. I have a clear picture of what I'd like to be doing in the future.
51. I know what my long range goals are.

Clarity of Personal Sexual Values

5. I'm confused about my personal sexual values and beliefs.
13. I'm confused about what I should and should not do sexually.
25. I have trouble knowing what my beliefs and values are about my personal sexual behavior.
49. I have my own set of rules to guide my sexual behavior (sex life).
70. I know for sure what is right and wrong sexually for me.

Understanding of Emotional Needs

14. I know what I want and need emotionally.
17. I know what I need to be happy.
48. Most of the time my emotional feelings are clear to me.
56. Most of the time I have a clear understanding of my feelings and emotions.
62. I'm confused about what I need emotionally.

Understanding of Personal Social Behavior

6. I often find myself acting in ways I don't understand.
19. I usually understand the way I act.
27. I understand how I behave around others.
34. I don't understand why I behave with my friends as I do.
66. Most of the time I know why I behave the way I do.

Understanding of Personal Sexual Response

21. I do not know much about my own physical and emotional sexual response.
31. I have a good understanding of my own sexual feelings and reactions.
36. I know how I react in different sexual situations.
45. I am confused about my reactions in sexual situations.
52. When I'm in a sexual situation, I get confused about my feelings.

Attitude Toward Various Gender Role Behaviors

8. Men should not hold jobs traditionally held by women.
28. Women should behave differently from men most of the time.
41. At work and at home, women should not have to behave differently than men, when they are equally capable.
50. Women and men should be able to have the same jobs, when they are equally capable.
65. People should not be expected to behave in certain ways just because they are male or female.

Attitude Toward Sexuality in Life

12. Sexual relationships create more problems than they're worth.
42. Sexual relationships make life too difficult.
55. A sexual relationship is one of the best things a person can have.
58. Sexual relationships only bring trouble to people.
64. Sexual relationships provide an important and fulfilling part of life.

Attitude Toward the Importance of Birth Control

4. Two people having sex should use some form of birth control, if they aren't ready for a child.
16. Birth control is not very important.
40. More people should be aware of the importance of birth control.
59. Birth control is not as important as some people say.
61. If two people have sex and aren't ready to have a child, it is very important that they use birth control.

Attitude Toward Premarital Intercourse

2. Unmarried people should not have sex.
20. People should not have sex before marriage.
22. It is all right for two people to have sex before marriage if they are in love.
29. People should have sex only if they are married.
63. It is all right for two people to have sex before marriage.

Attitude Toward the Use of Pressure and Force in Sexual Activity

9. People should never take "no" for an answer when they want to have sex.
15. No one should pressure another person into sexual activity.
46. It is all right to pressure someone into sexual activity.
47. People should not pressure others to have sex with them.
54. It is all right to demand sex from a girlfriend or boyfriend.

Recognition of the Importance of the Family

11. Families do very little for their children.
24. Family relationships are not important.
53. Families are very important.
60. Family relationships cause more trouble than they're worth.
69. Family relationships can be very valuable.

Self Esteem

3. Overall, I am satisfied with myself.
26. I feel I do not have much to be proud of.
35. At times I think I'm no good at all.
44. I feel that I have many good personal qualities.
68. I wish I had more respect for myself.

Satisfaction with Personal Sexuality

7. I am not happy with my sex life.
18. I am not satisfied with my sexual behavior (sex life).
33. I'm happy with my sexual behavior now.
39. Sexually, I feel like a failure.
57. I am very satisfied with my sexual activities just the way they are.

Satisfaction with Social Relationships

1. I am very happy with my friendships.
32. I don't have enough friends.
38. My friendships are not as good as I would like them to be.
43. I wish my friendships were better.
67. I feel good having as many friends as I have.
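
The key above groups the items into five-item scales but does not itself state how a scale score is computed. The Python sketch below shows one plausible approach, offered only as an illustration: it assumes that a scale score is the mean of the answered items on the 1 to 5 agreement scale, and that negatively worded items are reverse-scored as 6 minus the response. The choice of which items to reverse is our reading of the item wording, not something the key specifies, and the names SCALES and score_scale are hypothetical.

```python
from statistics import mean

# Item numbers come from the key above. The "reversed" sets are an ASSUMPTION
# based on item wording (negatively phrased items), not taken from the key.
SCALES = {
    "self_esteem": {"items": [3, 26, 35, 44, 68], "reversed": {26, 35, 68}},
    "premarital_intercourse": {"items": [2, 20, 22, 29, 63], "reversed": {2, 20, 29}},
}


def score_scale(responses, scale):
    """Return the mean of the answered items in a scale, or None if none were answered.

    responses maps item number -> circled answer (1-5); unanswered items are absent.
    """
    values = []
    for item in scale["items"]:
        answer = responses.get(item)
        if answer is None:
            continue  # skip items the respondent left blank
        if answer not in (1, 2, 3, 4, 5):
            raise ValueError(f"item {item}: answer must be 1-5, got {answer!r}")
        # Reverse-score negatively worded items so that a high score
        # always points in the same direction within a scale.
        values.append(6 - answer if item in scale["reversed"] else answer)
    return mean(values) if values else None


# Example: a respondent who agrees strongly with the positive self-esteem items
# and disagrees strongly with the negative ones.
example = {3: 5, 26: 1, 35: 1, 44: 5, 68: 1}
print(score_scale(example, SCALES["self_esteem"]))
```

Averaging rather than summing keeps the score on the original 1 to 5 range when a respondent skips an item; an evaluator who prefers sums, or who requires all five items, would need to change that choice.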






We are trying to find out if this program is successful. You can help us by
completing this questionnaire.

To keep your answers confidential and private, do NOT put your name anywhere on this
questionnaire. Please use a regular pen or pencil so that all questionnaires will
look about the same and no one will know which is yours.

Because this study is important, your answers are also important. Please answer
each question carefully.

Thank you for your help.



Name of school or organization
where course was taken: ____

Teacher's name: ____

Your birth date: Month ____ Day ____

Your sex (Check one): Male ____ Female ____

Your grade level in school (Check one): 9 ____
                                        10 ____
                                        11 ____
                                        12 ____






Part 1.

The questions below ask how often you have done some things. Some of the questions
are personal and ask about your social life and sex life. Some questions will not
apply to you. Please do not conclude from the questions that you should have had
all of the experiences the questions ask about. Instead, just mark whatever answer
describes you best.

Circle: 1 = if you do it Almost Never, which means about 5% of the time or less.
        2 = if you do it Sometimes, which means about 25% of the time.
        3 = if you do it Half the Time, which means about 50% of the time.
        4 = if you do it Usually, which means about 75% of the time.
        5 = if you do it Almost Always, which means about 95% of the time or more.
      DNA = if the question Does Not Apply to you.

1. When things you've done turn out poorly, how often do you take responsibility for your behavior and its consequences?    1 2 3 4 5 DNA

2. When things you've done turn out poorly, how often do you blame others?    1 2 3 4 5 DNA

3. When you are faced with a decision, how often do you take responsibility for making a decision about it?    1 2 3 4 5 DNA

4. When you have to make a decision, how often do you think hard about the consequences of each possible choice?    1 2 3 4 5 DNA

5. When you have to make a decision, how often do you get as much information as you can before making the decision?    1 2 3 4 5 DNA

6. When you have to make a decision, how often do you first discuss it with others?    1 2 3 4 5 DNA

7. When you have to make a decision about your sexual behavior (for example, going out on a date, holding hands, kissing, petting, or having sex), how often do you take responsibility for the consequences?    1 2 3 4 5 DNA

8. When you have to make a decision about your sexual behavior, how often do you think hard about the consequences of each possible choice?    1 2 3 4 5 DNA






9. When you have to make a decision about your sexual behavior, how often do you first get as much information as you can?    1 2 3 4 5 DNA

10. When you have to make a decision about your sexual behavior, how often do you first discuss it with others?    1 2 3 4 5 DNA

11. When you have to make a decision about your sexual behavior, how often do you make it on the spot without worrying about the consequences?    1 2 3 4 5 DNA

12. When a friend wants to talk with you, how often are you able to clear your mind and really listen to what your friend has to say?    1 2 3 4 5 DNA

13. When a friend is talking with you, how often do you ask questions if you don't understand what your friend is saying?    1 2 3 4 5 DNA

14. When a friend is talking with you, how often do you nod your head and say "yes" or something else to show that you are interested?    1 2 3 4 5 DNA

15. When you want to talk with a friend, how often are you able to get your friend to really listen to you?    1 2 3 4 5 DNA

16. When you talk with a friend, how often do you ask for your friend's reaction to what you've said?    1 2 3 4 5 DNA

17. When you talk with a friend, how often do you let your feelings show?    1 2 3 4 5 DNA

18. When you are with a friend you care about, how often do you let that friend know you care?    1 2 3 4 5 DNA

19. When you talk with a friend, how often do you include statements like "my feelings are...," "the way I think is...," or "it seems to me"?    1 2 3 4 5 DNA

20. When you are alone with a date or boy/girlfriend, how often can you tell him/her your feelings about what you want to do and do not want to do sexually? (If you are a boy, boy/girlfriend means girlfriend; if you are a girl, it means boyfriend.)    1 2 3 4 5 DNA

21. If a boy/girl puts pressure on you to be involved sexually and you don't want to be involved, how often do you say "no"? (If you are a boy, boy/girl means girl; if you are a girl, it means boy.)    1 2 3 4 5 DNA

22. If a boy/girl puts pressure on you to be involved sexually and you don't want to be involved, how often do you succeed in stopping it?    1 2 3 4 5 DNA

23. If you have sexual intercourse with your boy/girlfriend, how often can you talk with him/her about birth control?    1 2 3 4 5 DNA

24. If you have sexual intercourse and want to use birth control, how often do you insist on using birth control?    1 2 3 4 5 DNA






Part 2.

In this section, we want to know how uncomfortable you are doing different things.
Being "uncomfortable" means that it is difficult for you and it makes you nervous
and uptight. For each item, circle the number that describes you best, but if the
item doesn't apply to you, circle DNA.

Circle: 1 = if you are Comfortable.
        2 = if you are A Little Uncomfortable.
        3 = if you are Somewhat Uncomfortable.
        4 = if you are Very Uncomfortable.
      DNA = if the question Does Not Apply to you.



25. Getting together with a group of friends of the opposite sex.    1 2 3 4 DNA


26. Going to a party.    1 2 3 4 DNA

27. Talking with teenagers of the opposite sex.    1 2 3 4 DNA

28. Going out on a date.    1 2 3 4 DNA

29. Talking with friends about sex.    1 2 3 4 DNA

30. Talking with a date or boy/girlfriend about sex. (If you are a boy, boy/girlfriend means girlfriend; if you are a girl, it means boyfriend.)    1 2 3 4 DNA

31. Talking with parents about sex.    1 2 3 4 DNA

32. Talking with friends about birth control.    1 2 3 4 DNA

33. Talking with a date or boy/girlfriend about birth control. (If you are a boy, boy/girlfriend means girlfriend; if you are a girl, it means boyfriend.)    1 2 3 4 DNA

34. Talking with parents about birth control.    1 2 3 4 DNA

35. Expressing concern and caring for others.    1 2 3 4 DNA

36. Telling a date or boy/girlfriend what you want to do and do not want to do sexually.    1 2 3 4 DNA

37. Saying "no" to a sexual come-on.    1 2 3 4 DNA

38. Having your current sex life, whatever it may be (it may be doing nothing, kissing, petting, or having intercourse).    1 2 3 4 DNA






If you are not having sexual intercourse, circle DNA in the four questions below.

39. Insisting on using some form of birth control, if you are having sex.    1 2 3 4 DNA

40. Buying contraceptives at a drug store, if you are having sex.    1 2 3 4 DNA

41. Going to a doctor or clinic for contraception, if you are having sex.    1 2 3 4 DNA

42. Using some form of birth control, if you are having sex.    1 2 3 4 DNA

Part 3.

Circle the correct answer to the following two questions.

43. Have you ever had sex (sexual intercourse)?    yes    no

44. Have you had sex (sexual intercourse) during the last month?    yes    no






Part 4.

The following questions ask how many times you did some things during the last
month. Put a number in the right hand space to show the number of times you engaged
in that activity. If you did not do that during the last month, put a "0" in the
space.

Think CAREFULLY about the times that you have had sex during the last month. Think
also about the number of times you did not use birth control and the number of times
you used different types of birth control.

45. Last month, how many times did you have sex (sexual intercourse)?    ____ times in the last month

46. Last month, how many times did you have sex when you or your partner did not use any form of birth control?    ____ times in the last month

47. Last month, how many times did you have sex when you or your partner used a diaphragm, withdrawal (pulling out before releasing fluid), rhythm (not having sex on fertile days), or foam without condoms?    ____ times in the last month

48. Last month, how many times did you have sex when you or your partner used the pill, condoms (rubbers), or an IUD?    ____ times in the last month

(If you add your answers to questions #46, #47, and #48, the total should equal your
answer to #45. If it does not, please correct your answers.)

49. During the last month, how many times have you had a conversation or discussion about sex with your parents?    ____ times in the last month

50. During the last month, how many times have you had a conversation or discussion about sex with your friends?    ____ times in the last month

51. During the last month, how many times have you had a conversation or discussion about sex with a date or boy/girlfriend? (If you are a boy, boy/girlfriend means girlfriend; if you are a girl, it means boyfriend.)    ____ times in the last month

52. During the last month, how many times have you had a conversation or discussion about birth control with your parents?    ____ times in the last month

53. During the last month, how many times have you had a conversation or discussion about birth control with your friends?    ____ times in the last month

54. During the last month, how many times have you had a conversation or discussion about birth control with a date or boy/girlfriend?    ____ times in the last month



Thank you for completing the questionnaire. 
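
The questionnaire above asks respondents to confirm that their three birth control counts add up to their total count of intercourse for the month. An evaluator entering the data could apply the same arithmetic check before analysis. The Python sketch below is a minimal illustration of that check; the function name and argument names are hypothetical, and what to do with a failing questionnaire (drop it, flag it, or follow up) is left to the evaluator.

```python
def check_intercourse_counts(total, no_method, less_effective, more_effective):
    """Check the rule printed on the questionnaire: the three method counts
    must add up to the total number of times reported for the last month.

    total          -- times had sex last month (the total item)
    no_method      -- times with no birth control
    less_effective -- times with diaphragm, withdrawal, rhythm, or foam alone
    more_effective -- times with the pill, condoms, or an IUD

    Returns None when the answers are consistent, otherwise a short message.
    """
    counts = (total, no_method, less_effective, more_effective)
    if any(not isinstance(c, int) or c < 0 for c in counts):
        return "all four answers must be whole numbers of 0 or more"
    parts = no_method + less_effective + more_effective
    if parts != total:
        return f"method counts add up to {parts}, but the reported total is {total}"
    return None


# A consistent respondent: 1 + 1 + 2 = 4
print(check_intercourse_counts(4, 1, 1, 2))
# An inconsistent respondent: 1 + 1 + 2 is not 5
print(check_intercourse_counts(5, 1, 1, 2))
```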






We are trying to find out if this program is successful. You can help us by
completing this questionnaire.

To keep your answers confidential and private, do NOT put your name anywhere on this
questionnaire. Please use a regular pen or pencil so that all questionnaires will
look about the same and no one will know which is yours.

Because this study is important, your answers are also important. Please answer
each question carefully.

Thank you for your help.



Name of school or organization
where course was taken: ____

Teacher's name: ____

Your birth date: Month ____ Day ____

Your sex (Check one): Male ____ Female ____

Your grade level in school (Check one): 9 ____
                                        10 ____
                                        11 ____
                                        12 ____






Part 1.

Circle the one best answer to each of the questions below.

1. Some contraceptives:

a. can be obtained only with a doctor's prescription.
b. are available at family planning clinics.
c. can be bought over the counter at drug stores.
d. can be obtained by people under 18 without their parents' permission.
e. all of the above.

2. If 10 couples have sexual intercourse regularly without using any kind of
birth control, the number of couples who become pregnant by the end of 1
year is about:

a. one.
b. three.
c. six.
d. nine.
e. none of the above.

3. People having sexual intercourse can best prevent getting a sexually
transmitted disease (VD or STD) by using:

a. condoms (rubbers).
b. contraceptive foam.
c. the pill.
d. withdrawal (pulling out).

4. If a couple has sexual intercourse and uses no birth control, the woman
might get pregnant:

a. any time during the month.
b. only 1 week before menstruation begins.
c. only during menstruation.
d. only 1 week after menstruation begins.
e. only 2 weeks after menstruation begins.

5. The method of birth control which is least effective is:

a. a condom with foam.
b. the diaphragm with spermicidal jelly.
c. withdrawal (pulling out).
d. the pill.
e. abstinence (not having intercourse).

6. It is possible for a woman to become pregnant:

a. the first time she has sexual intercourse.
b. if she has sexual intercourse during her menstrual period.
c. if she has sexual intercourse standing up.
d. if sperm get near the opening of the vagina, even though the man's
penis does not enter her body.
e. all of the above.






7. In general, children born to young teenage parents:

a. have few problems because their parents are emotionally mature.
b. have a greater chance of being abused by their parents.
c. have normal birth weight.
d. have a greater chance of being healthy.
e. none of the above.

8. If people have sexual intercourse, the advantage of using condoms is that
they:

a. help prevent getting or giving VD.
b. can be bought in drug stores by either sex.
c. do not have dangerous side effects.
d. do not require a prescription.
e. all of the above.

9. Most unmarried girls who have children while still in high school:

a. depend upon their parents for support.
b. finish high school and graduate with their class.
c. never have to be on public welfare.
d. have the same social lives as their peers.
e. all of the above.

10. People choosing a birth control method:

a. should think only about the cost of the method.
b. should choose whatever method their friends are using.
c. should learn about all the methods before choosing the one that's best
for them.
d. should get the method that's easiest to get.
e. all of the above.






Part 2.

This part is NOT a knowledge test. We are interested in what you believe about some
important issues. Please rate each statement according to how much you agree or
disagree with it. Everyone will have different answers. Your answer is correct if
it describes you very well.

Circle: 1 = if you Strongly Disagree with the statement.
        2 = if you Somewhat Disagree with the statement.
        3 = if you feel Neutral about the statement.
        4 = if you Somewhat Agree with the statement.
        5 = if you Strongly Agree with the statement.



11. Unmarried people should not have sex.    1 2 3 4 5

12. I have my own set of rules to guide my sexual behavior (sex life).    1 2 3 4 5

13. Birth control is not very important.    1 2 3 4 5

14. People should not have sex before marriage.    1 2 3 4 5

15. I know for sure what is right and wrong sexually for me.    1 2 3 4 5

16. Birth control is not as important as some people say.    1 2 3 4 5

17. I have trouble knowing what my values are about my personal sexual behavior.    1 2 3 4 5

18. More people should be aware of the importance of birth control.    1 2 3 4 5

19. People should have sex only if they are married.    1 2 3 4 5

20. I'm confused about my personal sexual values and beliefs.    1 2 3 4 5

21. Two people having sex should use some form of birth control if they aren't ready for a child.    1 2 3 4 5

22. It is all right for two people to have sex before marriage if they are in love.    1 2 3 4 5

23. I'm confused about what I should and should not do sexually.    1 2 3 4 5

24. If two people have sex and aren't ready to have a baby, it is very important that they use birth control.    1 2 3 4 5

25. It is all right for two people to have sex before marriage.    1 2 3 4 5






Part 3.

The following parts ask questions that are personal and ask about your social life
and sex life. Some questions will not apply to you. Please do not conclude from
these questions that you should have had all of the experiences the questions ask
about. Instead, just mark whatever answer describes you best.

In this section, we want to know how uncomfortable you are doing different things.
Being "uncomfortable" means that it is difficult for you and you feel nervous and
uptight.

Circle: 1 = if you are Comfortable.
        2 = if you are A Little Uncomfortable.
        3 = if you are Somewhat Uncomfortable.
        4 = if you are Very Uncomfortable.
      DNA = if the question Does Not Apply to you.



26. Talking with friends about sex.    1 2 3 4 DNA

27. Talking with your boy/girlfriend about sex. ("boy/girlfriend" means "boyfriend" if you are a girl, and it means "girlfriend" if you are a boy.)    1 2 3 4 DNA

28. Talking with parents about sex.    1 2 3 4 DNA

29. Talking with friends about birth control.    1 2 3 4 DNA

30. Talking with your boy/girlfriend about birth control.    1 2 3 4 DNA

31. Talking with parents about birth control.    1 2 3 4 DNA

32. Having your current sex life, whatever it may be (it may be doing nothing, kissing, petting, or having intercourse).    1 2 3 4 DNA

If you are not having sexual intercourse, circle DNA in the three questions below.

33. Buying contraceptives at a drug store, if you are having sex.    1 2 3 4 DNA

34. Going to a doctor or clinic for contraception, if you are having sex.    1 2 3 4 DNA

35. Using birth control, if you are having sex.    1 2 3 4 DNA



Part 4.

The questions below ask how often you do some things.

Circle: 1 = if you do it Almost Never, which means about 5% of the time or less.
        2 = if you do it Sometimes, which means about 25% of the time.
        3 = if you do it Half the Time, which means about 50% of the time.
        4 = if you do it Usually, which means about 75% of the time.
        5 = if you do it Almost Always, which means about 95% of the time or more.
      DNA = if the question Does Not Apply to you.

36. When you have to make a decision about your sexual behavior (holding hands, kissing, petting, or having sex), how often do you think hard about the consequences of each possible alternative?    1 2 3 4 5 DNA

37. When you have to make a decision about your sexual behavior, how often do you first get as much information as you can?    1 2 3 4 5 DNA

38. When you have to make a decision about your sexual behavior, how often do you first discuss it with other people?    1 2 3 4 5 DNA

39. When you have to make a decision about your sexual behavior, how often do you make it on the spot without thinking about the consequences?    1 2 3 4 5 DNA

40. If a boy/girl puts pressure on you to be involved sexually and you don't want to be involved, how often do you stop him/her?    1 2 3 4 5 DNA

41. If you have sexual intercourse with your boy/girlfriend, how often can you talk with him/her about using birth control?    1 2 3 4 5 DNA

Part 5.

Circle the correct answer to the following two questions.

42. Have you ever had sexual intercourse?    yes    no
43. Have you had sexual intercourse during the last month?    yes    no






Part 6.

The following questions ask about activities during the last month. Put a number in
the right hand space which shows the number of times you engaged in that activity.
Put a "0" in that space if you did not engage in that activity during the last
month.

Think CAREFULLY about the times that you have had sex during the last month. Think
also about the number of times you did not use birth control and the number of times
you used different types of birth control.

44. Last month, how many times did you have sexual intercourse?    ____ times in the last month

45. Last month, how many times did you have sex when you or your partner did not use any form of birth control?    ____ times in the last month

46. Last month, how many times did you have sex when you or your partner used a diaphragm, withdrawal (pulling out before releasing fluid), rhythm (not having sex on fertile days), or foam without condoms?    ____ times in the last month

47. Last month, how many times did you have sex when you or your partner used the pill, condoms (rubbers), or an IUD?    ____ times in the last month

(If you add your answers to questions #45, #46, and #47, the total should equal your
answer to #44. If it does not, please correct your answers.)

48. During the last month, how many times have you had a conversation or discussion about sex with your parents?    ____ times in the last month

49. During the last month, how many times have you had a conversation or discussion about sex with your friends?    ____ times in the last month

50. During the last month, how many times have you had a conversation or discussion about sex with a date or boy/girlfriend? (If you are a boy, boy/girlfriend means girlfriend; if you are a girl, it means boyfriend.)    ____ times in the last month

51. During the last month, how many times have you had a conversation or discussion about birth control with your parents?    ____ times in the last month

52. During the last month, how many times have you had a conversation or discussion about birth control with your friends?    ____ times in the last month

53. During the last month, how many times have you had a conversation or discussion about birth control with a date or boy/girlfriend?    ____ times in the last month



Thank you for completing the questionnaire. 
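
The "how often" items in this questionnaire tie each circled number to an approximate percentage of the time (about 5%, 25%, 50%, 75%, and 95%), with DNA for items that do not apply. When the answers are entered for analysis, it can help to keep DNA separate from the numeric codes rather than coding it as zero. The Python sketch below illustrates one way to do this. The mapping values come from the printed instructions; treating "5% or less" and "95% or more" as exactly 0.05 and 0.95 is a simplifying assumption, and the names FREQUENCY_PROPORTION and to_proportion are hypothetical.

```python
# Approximate proportion of the time attached to each circled code,
# taken from the printed instructions. The end points (0.05 and 0.95)
# stand in for "5% or less" and "95% or more", which is an assumption.
FREQUENCY_PROPORTION = {1: 0.05, 2: 0.25, 3: 0.50, 4: 0.75, 5: 0.95}


def to_proportion(answer):
    """Convert a circled answer to an approximate proportion of the time.

    Returns None for DNA (Does Not Apply), so that such answers can be
    left out of averages instead of being counted as "never".
    """
    if isinstance(answer, str) and answer.strip().upper() == "DNA":
        return None
    try:
        code = int(answer)
    except (TypeError, ValueError):
        raise ValueError(f"unrecognized answer: {answer!r}")
    if code not in FREQUENCY_PROPORTION:
        raise ValueError(f"answer must be 1-5 or DNA, got {answer!r}")
    return FREQUENCY_PROPORTION[code]


# Example: one respondent's answers to four "how often" items.
answers = [4, "DNA", 2, 5]
usable = [p for p in map(to_proportion, answers) if p is not None]
print(usable)
```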






COURSE EVALUATION



We are trying to find out if this program is successful. You can help us by
completing this questionnaire.

To keep your answers confidential and private, do NOT put your name anywhere on this
questionnaire. Please use a regular pen or pencil so that all questionnaires will
look about the same and no one will know which is yours.

Because this study is important, your answers are also important. Please answer
each question carefully.

Thank you for your help.



Name of school or organization
where course was taken: ____

Teacher's name: ____

Your birth date: Month ____ Day ____

Your sex (Check one): Male ____ Female ____

Your grade level in school (Check one): 9 ____
                                        10 ____
                                        11 ____
                                        12 ____






Part 1.

Below is a list of questions about your teacher. Now that this class is over,
please answer each question by circling one number based upon this 5-point scale:

1 = Not at All
2 = A Small Amount
3 = A Medium Amount
4 = A Large Amount
5 = A Great Deal

1. Was the teacher enthusiastic about teaching this course?    1 2 3 4 5


2. Was the teacher uncomfortable discussing different things about sex?    1 2 3 4 5

3. Did the teacher discuss topics in a way that made students feel uncomfortable?    1 2 3 4 5

4. Did the teacher talk at a level that the students could understand?    1 2 3 4 5

5. Did the teacher care about the students?    1 2 3 4 5

6. Did the teacher show respect toward the students?    1 2 3 4 5

7. Did the students trust the teacher?    1 2 3 4 5

8. Did the teacher get along with the students?    1 2 3 4 5

9. Did the teacher encourage students to talk about their feelings and opinions?    1 2 3 4 5

10. Did the teacher talk too much about what's right and wrong?    1 2 3 4 5

11. Did the teacher listen carefully to the students?    1 2 3 4 5

12. Did the teacher discourage students from hurting others in sexual situations (such as knowingly spreading VD or forcing someone to have sex)?    1 2 3 4 5

13. Did the teacher encourage students to think about the consequences before having sexual relations?    1 2 3 4 5

14. Did the teacher encourage students to think about their own values about sexuality?    1 2 3 4 5

15. Did the teacher encourage the use of birth control to avoid an unwanted pregnancy?    1 2 3 4 5

16. Did the teacher encourage students to talk with their parents about sexuality?    1 2 3 4 5






Part 2 



Below is a list of questions about you and the course. Continue to answer each
question by circling one number based upon the same 5-point scale:

1 = Not at All

2 = A Small Amount

3 = A Medium Amount

4 = A Large Amount

5 = A Great Deal

17. Were you bored by the course?    1 2 3 4 5

18. Did students participate in class discussions?    1 2 3 4 5

19. Were you encouraged to ask any questions you had    1 2 3 4 5
about sex?

20. Was it hard for you to talk about your own thoughts    1 2 3 4 5
and feelings?

21. Was it hard for you to ask questions and talk about    1 2 3 4 5
sexual topics?

22. Did you show concern for the other students in the    1 2 3 4 5
class?

23. Did the other students show concern for you?    1 2 3 4 5

24. Were students' opinions kept confidential (not spread    1 2 3 4 5
outside the classroom)?

25. Were you permitted to have values or opinions that    1 2 3 4 5
were different from others in the class?






Part 3.



These five questions should be answered using another 5-point scale. Circle the
number that best describes your opinion, but if you don't know, circle DK.

1 = Very Poor

2 = Poor

3 = Average

4 = Good

5 = Excellent

DK = Don't Know

26. What is your opinion of the teacher?    1 2 3 4 5 DK

27. What is your opinion of the topics covered in    1 2 3 4 5 DK
the course?

28. What is your opinion of the materials used,    1 2 3 4 5 DK
such as books and films?

29. What is your opinion of the organization and    1 2 3 4 5 DK
format of the program, such as length, location,
and time?

30. What is your opinion of the overall program?    1 2 3 4 5 DK

31. What things about the program did you particularly like?



32. What things about the program do you think should be changed? How? 



Thank you for completing the questionnaire. 




ASSESSMENT OF COURSE IMPACT



We are trying to find out if this program is successful. You can help us by 
completing this questionnaire. 

To keep your answers confidential and private, do NOT put your name anywhere on this 
questionnaire. Please use a regular pen or pencil so that all questionnaires will 
look about the same and no one will know which is yours.

Because this study is important, your answers are also important. Please answer
each question carefully. 

Thank you for your help. 



Name of school or organization
where course was taken: ____



Teacher's name: ____



Your birth date: Month ____ Day ____

Your sex (Check one): Male ______ Female ______



Your grade level in school (Check one): 9 ____

10 ____
11 ____
12 ____






Directions: Now that this sexuality education course is over, we would like to know
how it may have changed you, if at all. Please answer each question by circling the
number that best describes how you have changed because of this course.

Circle: 1 = Much Less

2 = Somewhat Less

3 = About the Same

4 = Somewhat More

5 = Much More

1. Do you know less or more about sexuality?    1 2 3 4 5

2. Do you understand yourself and your behavior less    1 2 3 4 5
or more?

3. Are your attitudes and values about your own sexual    1 2 3 4 5
behavior less or more clear?

4. Do you now feel that using birth control when people    1 2 3 4 5
are not ready to have children is less or more
important?

5. Do you talk about sexuality (going out, having sex,    1 2 3 4 5
birth control, or male and female sex roles) with your
friends less or more?

6. Do you talk about sexuality with your boy/girlfriend    1 2 3 4 5
less or more?

7. Do you talk about sexuality with your parents less    1 2 3 4 5
or more?

8. When you talk about sexuality with others (such as    1 2 3 4 5
your friends, boy/girlfriend, and parents) are you
less or more comfortable?

9. Do you talk about sexuality less or more effectively    1 2 3 4 5
(that is, are you less or more able to talk about your
thoughts, feelings, and needs and to listen carefully)?

10. Are you less or more likely to have sex?    1 2 3 4 5

11. If you have sex, would you be less or more likely to    1 2 3 4 5
use birth control?

12. If you have sex, would you be less or more comfortable    1 2 3 4 5
using birth control?

13. Do you respect yourself less or more?    1 2 3 4 5




14. Are you less or more satisfied with your social life?    1 2 3 4 5

15. Are you less or more satisfied with your sex life,    1 2 3 4 5
whatever it may be (it may be doing nothing, kissing,
petting, or having sex)?








Part 2.

We are still interested in knowing about any ways you may have changed because of
this course. Please answer the following questions by circling the number that
describes you best:

1 = Much Worse

2 = Somewhat Worse

3 = About the Same

4 = Somewhat Better

5 = Much Better

16. Do you now make worse or better decisions about your    1 2 3 4 5
social life?

17. Do you now make worse or better decisions about your    1 2 3 4 5
physical sexual behavior?

18. Do you now get along with your friends worse or better?    1 2 3 4 5



Thank you for completing the questionnaire. 






We are trying to find out if this program is successful. You can help us by
completing this questionnaire.

To keep your answers confidential and private, do NOT put your name anywhere on this
questionnaire. Please use a regular pen or pencil so that all questionnaires will
look about the same and no one will know which is yours.

Because this study is important, your answers are also important. Please answer 
each question carefully. 

Thank you for your help. 



Name of school or organization
where course was taken: ____



Teacher's name: ____



Your birth date: Month ____ Day ____

Your sex (Check one): Male ______ Female ______

Your grade level in school (Check one): 9 ____

10 ____
11 ____
12 ____






Now that your teenager's sex education course is over, we are interested in your
ideas about whether it changed him or her. For each question, please circle the
number that best describes your opinion. If you don't know, circle DK.

1 = Much Less

2 = Somewhat Less

3 = About the Same

4 = Somewhat More

5 = Much More

DK = Don't Know

1. Does your teenager know less or more about sexuality?    1 2 3 4 5 DK

2. Are your teenager's attitudes and values about    1 2 3 4 5 DK
sexuality less or more clear?

3. Are you less or more comfortable talking about    1 2 3 4 5 DK
sexuality with your teenager?

4. Have you actually talked about sexuality with your    1 2 3 4 5 DK
teenager less or more?

5. Does your teenager talk and listen to you about    1 2 3 4 5 DK
sexuality less or more effectively? That is,
is your teenager less or more able to talk about
thoughts, feelings, and needs, and to listen
carefully?

6. Is your teenager less or more likely to make good    1 2 3 4 5 DK
decisions about social and sexual behavior? That
is, is your teenager less or more able to examine
alternatives and consider consequences?

7. Is your teenager less or more likely to have sex    1 2 3 4 5 DK
soon because of this course?






These five questions should be answered using another 5-point scale. Again, if you
don't know, circle DK.

1 = Very Poor

2 = Poor

3 = Average

4 = Good

5 = Excellent

DK = Don't Know

7. What is your opinion of the teacher?    1 2 3 4 5 DK

8. What is your opinion of the topics covered in the    1 2 3 4 5 DK
course?

9. What is your opinion of the materials used, such as    1 2 3 4 5 DK
books and films?

10. What is your opinion of the organization and format    1 2 3 4 5 DK
of the program, such as length, location, and time?

11. What is your opinion of the overall program?    1 2 3 4 5 DK

12. What things about the program did you particularly like?



13. What things about the program do you think should be changed? How?



Thank you for completing the questionnaire. 






